+```
+_What about inside a view?_
+
+
+```python
+    def update_profile(request, user_id):
+        user = User.objects.get(pk=user_id)
+        user.profile.bio = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit...'
+        user.save()
+```
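+Generally speaking, you will never need to call the Profile's save method. Everything is done through the User model.
+
+_What if I'm using Django forms?_
+
+Did you know that you can process more than one form at once? Take a look at this snippet: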
+
+**forms.py**
+
+```python
+    class UserForm(forms.ModelForm):
+        class Meta:
+            model = User
+            fields = ('first_name', 'last_name', 'email')
+
+    class ProfileForm(forms.ModelForm):
+        class Meta:
+            model = Profile
+            fields = ('url', 'location', 'company')
+```
+**views.py**
+
+```python
+    @login_required
+    @transaction.atomic
+    def update_profile(request):
+        if request.method == 'POST':
+            user_form = UserForm(request.POST, instance=request.user)
+            profile_form = ProfileForm(request.POST, instance=request.user.profile)
+            if user_form.is_valid() and profile_form.is_valid():
+                user_form.save()
+                profile_form.save()
+                messages.success(request, _('Your profile was successfully updated!'))
+                return redirect('settings:profile')
+            else:
+                messages.error(request, _('Please correct the error below.'))
+        else:
+            user_form = UserForm(instance=request.user)
+            profile_form = ProfileForm(instance=request.user.profile)
+        return render(request, 'profiles/profile.html', {
+            'user_form': user_form,
+            'profile_form': profile_form
+        })
+```
+**profile.html**
+
+```html
+
+```
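+_And what about the extra database queries you mentioned?_
+
+Oh, right. I dealt with this issue in another post called "Optimize Database Queries". You can [click here](http://simpleisbetterthancomplex.com/tips/2016/05/16/django-tip-3-optimize-database-queries.html) to read it.
+
+But, long story short: Django relationships are lazy. That means Django only queries the database when you access one of the related properties. Sometimes this has undesired effects, such as firing hundreds or thousands of queries. This problem can be mitigated with the `select_related` method.
+
+If you know beforehand that you will need to access the related data, you can prefetch it in a single database query: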
+
+```python
+ users = User.objects.all().select_related('profile')
+```
+* * *
+
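+#### Extending the User Model Using a Custom Model Extending AbstractBaseUser
+
+The scary option. Honestly, I try to avoid it at all costs. But sometimes you can't avoid it, and it is perfectly workable. Hardly anything is purely good or purely bad. For the most part there is a more or less appropriate solution. If this is the most appropriate solution for your case, read on.
+
+I had to do it once. Honestly, I don't know whether this is the cleanest way to do it, but here goes:
+
+I needed to use the email address as the authentication token, and in that scenario the `username` was completely useless to me. There was also no need for the `is_staff` flag, as I wasn't using the Django Admin.
+
+Here is how I defined my own user model: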
+
+```python
+    from __future__ import unicode_literals
+
+    from django.db import models
+    from django.core.mail import send_mail  # needed by email_user() below
+    from django.contrib.auth.models import PermissionsMixin
+    from django.contrib.auth.base_user import AbstractBaseUser
+    from django.utils.translation import ugettext_lazy as _
+
+    from .managers import UserManager
+
+
+    class User(AbstractBaseUser, PermissionsMixin):
+        email = models.EmailField(_('email address'), unique=True)
+        first_name = models.CharField(_('first name'), max_length=30, blank=True)
+        last_name = models.CharField(_('last name'), max_length=30, blank=True)
+        date_joined = models.DateTimeField(_('date joined'), auto_now_add=True)
+        is_active = models.BooleanField(_('active'), default=True)
+        avatar = models.ImageField(upload_to='avatars/', null=True, blank=True)
+
+        objects = UserManager()
+
+        USERNAME_FIELD = 'email'
+        REQUIRED_FIELDS = []
+
+        class Meta:
+            verbose_name = _('user')
+            verbose_name_plural = _('users')
+
+        def get_full_name(self):
+            '''
+            Returns the first_name plus the last_name, with a space in between.
+            '''
+            full_name = '%s %s' % (self.first_name, self.last_name)
+            return full_name.strip()
+
+        def get_short_name(self):
+            '''
+            Returns the short name for the user.
+            '''
+            return self.first_name
+
+        def email_user(self, subject, message, from_email=None, **kwargs):
+            '''
+            Sends an email to this User.
+            '''
+            send_mail(subject, message, from_email, [self.email], **kwargs)
+```
+
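+I wanted to keep it as close as possible to the existing User model. Since we are inheriting from `AbstractBaseUser`, we have to follow some rules:
+
+ * **USERNAME_FIELD**: A string naming the field on the User model that is used as the unique identifier. The field must be unique (i.e., have `unique=True` set in its definition);
+ * **REQUIRED_FIELDS**: A list of the field names that will be prompted for when creating a user via the `createsuperuser` management command;
+ * **is_active**: A boolean attribute that indicates whether the user is considered "active";
+ * **get_full_name():** A longer formal identifier for the user. A common interpretation is the user's full name, but it can be any string that identifies the user.
+ * **get_short_name():** A short, informal identifier for the user. A common interpretation is the user's first name.
+
+Okay, let's move on. I also had to define my own `UserManager`. That's because the existing manager defines the `create_user` and `create_superuser` methods.
+
+So here is what my `UserManager` looks like: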
+
+```python
+    from django.contrib.auth.base_user import BaseUserManager
+
+    class UserManager(BaseUserManager):
+        use_in_migrations = True
+
+        def _create_user(self, email, password, **extra_fields):
+            """
+            Creates and saves a User with the given email and password.
+            """
+            if not email:
+                raise ValueError('The given email must be set')
+            email = self.normalize_email(email)
+            user = self.model(email=email, **extra_fields)
+            user.set_password(password)
+            user.save(using=self._db)
+            return user
+
+        def create_user(self, email, password=None, **extra_fields):
+            extra_fields.setdefault('is_superuser', False)
+            return self._create_user(email, password, **extra_fields)
+
+        def create_superuser(self, email, password, **extra_fields):
+            extra_fields.setdefault('is_superuser', True)
+
+            if extra_fields.get('is_superuser') is not True:
+                raise ValueError('Superuser must have is_superuser=True.')
+
+            return self._create_user(email, password, **extra_fields)
+```
+
+Basically, I cleaned up the existing `UserManager`, removing the `username` and `is_staff` properties.
+
+Now the final step. We have to update our settings.py, more specifically the `AUTH_USER_MODEL` property.
+
+```python
+ AUTH_USER_MODEL = 'core.User'
+```
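+This way we are telling Django to use our custom model instead of the default one. In the example above, I created the custom model inside an app named `core`.
+
+_How should I reference this model?_
+
+Well, there are two ways. Consider a model named `Course`: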
+
+```python
+    from django.db import models
+    from testapp.core.models import User
+
+    class Course(models.Model):
+        slug = models.SlugField(max_length=100)
+        name = models.CharField(max_length=100)
+        tutor = models.ForeignKey(User, on_delete=models.CASCADE)
+```
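+This is perfectly fine. But if you are creating a reusable app that you want to make publicly available, it is strongly advised that you use the following strategy: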
+
+```python
+    from django.db import models
+    from django.conf import settings
+
+    class Course(models.Model):
+        slug = models.SlugField(max_length=100)
+        name = models.CharField(max_length=100)
+        tutor = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
+```
+* * *
+
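+#### Extending the User Model Using a Custom Model Extending AbstractUser
+
+This one is pretty straightforward, since the class `django.contrib.auth.models.AbstractUser` provides the full implementation of the default User as an abstract model.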
+
+```python
+    from django.db import models
+    from django.contrib.auth.models import AbstractUser
+
+    class User(AbstractUser):
+        bio = models.TextField(max_length=500, blank=True)
+        location = models.CharField(max_length=30, blank=True)
+        birth_date = models.DateField(null=True, blank=True)
+```
+Then we have to update our settings.py, defining the `AUTH_USER_MODEL` property.
+
+```python
+ AUTH_USER_MODEL = 'core.User'
+```
+As with the previous method, this should ideally be done at the beginning of a project, and with extra care. It will change the whole database schema.
+
+Also, when creating foreign keys to the User model, it is better to import the settings with `from django.conf import settings` and refer to `settings.AUTH_USER_MODEL` instead of referring directly to the custom User model.
+
+* * *
+
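+#### Conclusions
+
+Alright! We have gone through four different ways to extend the existing User model. I tried to give you as much detail as possible. As I said before, there is no _best solution_. It really depends on what you need to achieve. Keep it simple and choose wisely.
+
+ * **Proxy Model:** You are happy with everything the Django User provides and don't need to store extra information.
+ * **User Profile:** You are happy with the way Django handles authentication and need to add some attributes to the User that are unrelated to authentication.
+ * **Custom User Model from AbstractBaseUser:** The way Django handles authentication doesn't fit your project.
+ * **Custom User Model from AbstractUser:** The way Django handles authentication fits your project well, but you still want to add extra attributes without creating a separate model.
+
+Don't hesitate to ask me questions or tell me what you think about this post!
+
+You can also [join my mailing list](http://eepurl.com/b0gR51). Every week I send exclusive tips directly to your inbox! :-)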
diff --git "a/Django/\345\270\246django\346\225\231\347\250\213\347\232\204Facebook\350\201\212\345\244\251\346\234\272\345\231\250\344\272\272\357\274\214\345\217\210\345\220\215\347\254\221\350\257\235\346\234\272\345\231\250\344\272\272.md" "b/Web/Django/\345\270\246django\346\225\231\347\250\213\347\232\204Facebook\350\201\212\345\244\251\346\234\272\345\231\250\344\272\272\357\274\214\345\217\210\345\220\215\347\254\221\350\257\235\346\234\272\345\231\250\344\272\272.md"
similarity index 100%
rename from "Django/\345\270\246django\346\225\231\347\250\213\347\232\204Facebook\350\201\212\345\244\251\346\234\272\345\231\250\344\272\272\357\274\214\345\217\210\345\220\215\347\254\221\350\257\235\346\234\272\345\231\250\344\272\272.md"
rename to "Web/Django/\345\270\246django\346\225\231\347\250\213\347\232\204Facebook\350\201\212\345\244\251\346\234\272\345\231\250\344\272\272\357\274\214\345\217\210\345\220\215\347\254\221\350\257\235\346\234\272\345\231\250\344\272\272.md"
diff --git a/Flask/README.md b/Web/Flask/README.md
similarity index 100%
rename from Flask/README.md
rename to Web/Flask/README.md
diff --git a/raw/An Introduction to Stock Market Data Analysis with Python (Part 2).md b/raw/An Introduction to Stock Market Data Analysis with Python (Part 2).md
new file mode 100644
index 0000000..bf8d166
--- /dev/null
+++ b/raw/An Introduction to Stock Market Data Analysis with Python (Part 2).md
@@ -0,0 +1,1311 @@
+Original: [An Introduction to Stock Market Data Analysis with Python (Part 2)](https://ntguardian.wordpress.com/2016/09/26/introduction-stock-market-data-python-2/)
+
+---
+
+*This post is the second in a two-part series on stock data analysis using Python, based on a lecture I gave on the subject for [MATH 3900 (Data Mining) at the University of Utah](http://datasciencecourse.net/2016/index.html) [(read part 1 here)](https://ntguardian.wordpress.com/2016/09/19/introduction-stock-market-data-python-1/). In these posts, I will discuss basics such as obtaining the data from Yahoo! Finance using **pandas**, visualizing stock data, moving averages, developing a moving-average crossover strategy, backtesting, and benchmarking. This second post discusses topics including devising a moving average crossover strategy, backtesting, and benchmarking, along with practice problems for readers to ponder.*
+
+**_NOTE: The information in this post is of a general nature containing information and opinions from the author's perspective. None of the content of this post should be considered financial advice. Furthermore, any code written here is provided without any form of guarantee. Individuals who choose to use it do so at their own risk._**
+
+## Trading Strategy
+
+Call an **open position** a trade that will be terminated in the future when a
+condition is met. A **long** position is one in which a profit is made if the
+financial instrument traded increases in value, and a **short** position is one
+in which a profit is made if the financial asset being traded decreases in
+value. When trading stocks directly, all long positions are bullish and all
+short positions are bearish. That said, a bullish attitude need not be
+accompanied by a long position, and a bearish attitude need not be accompanied
+by a short position (this is particularly true when trading stock options).
+
+Here is an example. Let's say you buy a stock with the expectation that the
+stock will increase in value, with a plan to sell the stock at a higher price.
+This is a long position: you are holding a financial asset for which you will
+profit if the asset increases in value. Your potential profit is unlimited,
+and your potential losses are limited by the price of the stock since stock
+prices never go below zero. On the other hand, if you expect a stock to
+decrease in value, you may borrow the stock from a brokerage firm and sell it,
+with the expectation of buying the stock back later at a lower price, thus
+earning you a profit. This is called **shorting a stock**, and is a short
+position, since you will earn a profit if the stock drops in value. The
+potential profit from shorting a stock is limited by the price of the stock
+(the best you can do is have the stock become worth nothing; you buy it back
+for free), while the losses are unlimited, since you could potentially spend
+an arbitrarily large amount of money to buy the stock back. Thus, a broker
+will expect an investor to be in a very good financial position before
+allowing the investor to short a stock.
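+
+To make the asymmetry concrete, here is a minimal sketch of the per-trade arithmetic. The function names `long_profit` and `short_profit` are illustrative only, and the sketch ignores fees, margin interest, and dividends.
+
```python
def long_profit(buy_price, sell_price, shares=1):
    """Profit from buying first and selling later."""
    return (sell_price - buy_price) * shares


def short_profit(sell_price, buy_back_price, shares=1):
    """Profit from selling borrowed shares first and buying them back later."""
    return (sell_price - buy_back_price) * shares


# Long at $100: the worst case is the stock going to $0; the upside is open-ended.
print(long_profit(100, 0))     # -100
print(long_profit(100, 350))   # 250

# Short at $100: the best case is the stock going to $0; the downside is open-ended.
print(short_profit(100, 0))    # 100
print(short_profit(100, 350))  # -250
```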
+
+Any trader must have a set of rules that determine how much of her money she
+is willing to bet on any single trade. For example, a trader may decide that
+under no circumstances will she risk more than 10% of her portfolio on a
+trade. Additionally, in any trade, a trader must have an **exit strategy**, a
+set of conditions determining when she will exit the position, for either
+profit or loss. A trader may set a **target**, which is the minimum profit
+that will induce the trader to leave the position. Likewise, a trader must
+have a maximum loss she is willing to tolerate; if potential losses go beyond
+this amount, the trader will exit the position in order to prevent any further
+loss (this is usually done by setting a **stop-loss order**, an order that is
+triggered to prevent further losses).
+
+We will call a plan that includes trading signals for prompting trades, a rule
+for deciding how much of the portfolio to risk on any particular strategy, and
+a complete exit strategy for any trade an overall **trading strategy**. Our
+concern now is to design and evaluate trading strategies.
+
+We will suppose that the amount of money in the portfolio involved in any
+particular trade is a fixed proportion; 10% seems like a good number. We will
+also say that for any trade, if losses exceed 20% of the value of the trade,
+we will exit the position. Now we need a means for deciding when to enter a
+position and when to exit for a profit.
+
+Here, I will be demonstrating a [moving average crossover strategy](http://www.investopedia.com/university/movingaverage/movingaverages4.asp).
+We will use two moving averages, one we consider "fast", and the other "slow". The
+strategy is:
+
+ * Trade the asset when the fast moving average crosses over the slow moving average.
+ * Exit the trade when the fast moving average crosses over the slow moving average again.
+
+A long trade will be prompted when the fast moving average crosses from below
+to above the slow moving average, and the trade will be exited when the fast
+moving average crosses below the slow moving average later. A short trade will
+be prompted when the fast moving average crosses below the slow moving
+average, and the trade will be exited when the fast moving average later
+crosses above the slow moving average.
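+
+As a rough illustration of the rule, the sketch below computes two simple moving averages on a made-up price list and reports where the fast one sits relative to the slow one (1 above, -1 below, 0 when tied or when there is not yet enough data). It uses plain Python lists rather than the pandas objects used elsewhere in this post, and `sma` and `fast_vs_slow` are hypothetical helper names.
+
```python
def sma(prices, window):
    """Simple moving average; None until `window` prices are available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def fast_vs_slow(prices, fast, slow):
    """1 where the fast average is above the slow one, -1 where below, else 0."""
    result = []
    for f, s in zip(sma(prices, fast), sma(prices, slow)):
        if f is None or s is None or f == s:
            result.append(0)
        else:
            result.append(1 if f > s else -1)
    return result


prices = [10, 11, 12, 13, 12, 11, 10, 9, 10, 12, 14]  # made-up closing prices
print(fast_vs_slow(prices, fast=2, slow=4))
# [0, 0, 0, 1, 1, -1, -1, -1, -1, 1, 1]
```
+
+A long trade would be opened where the value flips to 1 and closed where it flips to -1; a short trade is the mirror image.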
+
+We now have a complete strategy. But before we decide we want to use it, we
+should try to evaluate the quality of the strategy first. The usual means for
+doing so is **backtesting**, which is looking at how profitable the strategy
+is on historical data. For example, looking at the above chart's performance
+on Apple stock, if the 20-day moving average is the fast moving average and
+the 50-day moving average the slow, this strategy does not appear to be very
+profitable, at least not if you are always taking long positions.
+
+Let's see if we can automate the backtesting task. We first identify when the
+20-day average is below the 50-day average, and vice versa.
+
+[code]
+
+ apple['20d-50d'] = apple['20d'] - apple['50d']
+ apple.tail()
+
+[/code]
+
+Date | Open | High | Low | Close | Volume | Adj Close | 20d | 50d | 200d | 20d-50d
+---|---|---|---|---|---|---|---|---|---|---
+2016-08-26 | 107.410004 | 107.949997 | 106.309998 | 106.940002 | 27766300 | 106.940002 | 107.87 | 101.51 | 102.73 | 6.36
+2016-08-29 | 106.620003 | 107.440002 | 106.290001 | 106.820000 | 24970300 | 106.820000 | 107.91 | 101.74 | 102.68 | 6.17
+2016-08-30 | 105.800003 | 106.500000 | 105.500000 | 106.000000 | 24863900 | 106.000000 | 107.98 | 101.96 | 102.63 | 6.02
+2016-08-31 | 105.660004 | 106.570000 | 105.639999 | 106.099998 | 29662400 | 106.099998 | 108.00 | 102.16 | 102.60 | 5.84
+2016-09-01 | 106.139999 | 106.800003 | 105.620003 | 106.730003 | 26643600 | 106.730003 | 108.04 | 102.39 | 102.56 | 5.65
+
+We will refer to the sign of this difference as the **regime**; that is, if
+the fast moving average is above the slow moving average, this is a bullish
+regime (the bulls rule), and a bearish regime (the bears rule) holds when the
+fast moving average is below the slow moving average. I identify regimes with
+the following code.
+
+[code]
+
+ # np.where() is a vectorized if-else function, where a condition is checked for each component of a vector, and the first argument passed is used when the condition holds, and the other passed if it does not
+ apple["Regime"] = np.where(apple['20d-50d'] > 0, 1, 0)
+ # We have 1's for bullish regimes and 0's for everything else. Below I replace bearish regimes's values with -1, and to maintain the rest of the vector, the second argument is apple["Regime"]
+ apple["Regime"] = np.where(apple['20d-50d'] < 0, -1, apple["Regime"])
+ apple.loc['2016-01-01':'2016-08-07',"Regime"].plot(ylim = (-2,2)).axhline(y = 0, color = "black", lw = 2)
+
+[/code]
+
+
+
+[code]
+
+ apple["Regime"].plot(ylim = (-2,2)).axhline(y = 0, color = "black", lw = 2)
+
+[/code]
+
+
+
+[code]
+
+ apple["Regime"].value_counts()
+
+[/code]
+
+[code]
+
+ 1 966
+ -1 663
+ 0 50
+ Name: Regime, dtype: int64
+
+[/code]
+
+The output above indicates that for 966 days the market was bullish on
+Apple, while for 663 days the market was bearish, and it was neutral for 50
+days.
+
+Trading signals appear at regime changes. When a bullish regime begins, a buy
+signal is triggered, and when it ends, a sell signal is triggered. Likewise,
+when a bearish regime begins, a sell signal is triggered, and when the regime
+ends, a buy signal is triggered (this is of interest only if you ever will
+short the stock, or use some derivative like a stock option to bet against the
+market).
+
+It's simple to obtain signals. Let $r_t$ indicate the regime at time $t$, and
+$s_t$ the signal at time $t$. Then:
+
+$$s_t = \text{sign}(r_t - r_{t-1})$$
+
+$s_t \in \{-1, 0, 1\}$, with $-1$ indicating "sell", $1$ indicating "buy", and
+$0$ no action. We can obtain signals like so:
+
+[code]
+
+    # To ensure that all trades close out, I temporarily change the regime of the last row to 0
+    # (.loc with the last index label is used because the .ix indexer was removed in pandas 1.0)
+    last_day = apple.index[-1]
+    regime_orig = apple.loc[last_day, "Regime"]
+    apple.loc[last_day, "Regime"] = 0
+    apple["Signal"] = np.sign(apple["Regime"] - apple["Regime"].shift(1))
+    # Restore original regime data
+    apple.loc[last_day, "Regime"] = regime_orig
+    apple.tail()
+
+[/code]
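+
+The same rule can be checked by hand on a toy example. This sketch uses a plain list instead of a DataFrame column and mirrors the steps above: the last regime is treated as 0 so that any open trade is closed out, and each signal is the sign of the change in regime. `sign` and `signals` are illustrative names, and `None` stands in for the `NaN` that `shift(1)` produces on the first row.
+
```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def signals(regime):
    """Signal at t is sign(regime[t] - regime[t-1]), with the final regime forced to 0."""
    closed = list(regime[:-1]) + [0]  # close out whatever is open on the last day
    out = [None]                      # the first day has no previous regime
    for prev, cur in zip(closed, closed[1:]):
        out.append(sign(cur - prev))
    return out


regime = [0, 0, 0, 1, 1, -1, -1, -1, -1, 1, 1]  # made-up regime sequence
print(signals(regime))
# [None, 0, 0, 1, 0, -1, 0, 0, 0, 1, -1]
```
+
+Note that a direct flip from 1 to -1 changes the regime by 2 but still yields a single sell signal, because only the sign is kept.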
+
+Date | Open | High | Low | Close | Volume | Adj Close | 20d | 50d | 200d | 20d-50d | Regime | Signal
+---|---|---|---|---|---|---|---|---|---|---|---|---
+2016-08-26 | 107.410004 | 107.949997 | 106.309998 | 106.940002 | 27766300 | 106.940002 | 107.87 | 101.51 | 102.73 | 6.36 | 1.0 | 0.0
+2016-08-29 | 106.620003 | 107.440002 | 106.290001 | 106.820000 | 24970300 | 106.820000 | 107.91 | 101.74 | 102.68 | 6.17 | 1.0 | 0.0
+2016-08-30 | 105.800003 | 106.500000 | 105.500000 | 106.000000 | 24863900 | 106.000000 | 107.98 | 101.96 | 102.63 | 6.02 | 1.0 | 0.0
+2016-08-31 | 105.660004 | 106.570000 | 105.639999 | 106.099998 | 29662400 | 106.099998 | 108.00 | 102.16 | 102.60 | 5.84 | 1.0 | 0.0
+2016-09-01 | 106.139999 | 106.800003 | 105.620003 | 106.730003 | 26643600 | 106.730003 | 108.04 | 102.39 | 102.56 | 5.65 | 1.0 | -1.0
+
+[code]
+
+ apple["Signal"].plot(ylim = (-2, 2))
+
+[/code]
+
+
+
+[code]
+
+ apple["Signal"].value_counts()
+
+[/code]
+
+[code]
+
+ 0.0 1637
+ -1.0 21
+ 1.0 20
+ Name: Signal, dtype: int64
+
+[/code]
+
+We would buy Apple stock 20 times and sell Apple stock 21 times. If we only go
+long on Apple stock, only 20 trades will be engaged in over the 6-year period,
+while if we pivot from a long to a short position every time a long position
+is terminated, we would engage in roughly twice as many trades. (Bear in mind
+that trading more frequently isn't necessarily good; trades are never free.)
+
+You may notice that the system as it currently stands isn't very robust, since
+even a fleeting moment when the fast moving average is above the slow moving
+average triggers a trade, resulting in trades that end immediately (which is
+bad, if only because realistically every trade is accompanied by a fee
+that can quickly erode earnings). Additionally, every bullish regime
+immediately transitions into a bearish regime, and if you were constructing
+trading systems that allow both bullish and bearish bets, this would lead to
+the end of one trade immediately triggering a new trade that bets on the
+market in the opposite direction, which again seems finicky. A better system
+would require more evidence that the market is moving in some particular
+direction. But we will not concern ourselves with these details for now.
+
+Let's now try to identify what the price of the stock is at every buy and
+every sell.
+
+[code]
+
+ apple.loc[apple["Signal"] == 1, "Close"]
+
+[/code]
+
+[code]
+
+ Date
+ 2010-03-16 224.449997
+ 2010-06-18 274.070011
+ 2010-09-20 283.230007
+ 2011-05-12 346.569988
+ 2011-07-14 357.770004
+ 2011-12-28 402.640003
+ 2012-06-25 570.770020
+ 2013-05-17 433.260010
+ 2013-07-31 452.529984
+ 2013-10-16 501.110001
+ 2014-03-26 539.779991
+ 2014-04-25 571.939980
+ 2014-08-18 99.160004
+ 2014-10-28 106.739998
+ 2015-02-05 119.940002
+ 2015-04-28 130.559998
+ 2015-10-27 114.550003
+ 2016-03-11 102.260002
+ 2016-07-01 95.889999
+ 2016-07-25 97.339996
+ Name: Close, dtype: float64
+
+[/code]
+
+[code]
+
+ apple.loc[apple["Signal"] == -1, "Close"]
+
+[/code]
+
+[code]
+
+ Date
+ 2010-06-11 253.509995
+ 2010-07-22 259.020000
+ 2011-03-30 348.630009
+ 2011-03-31 348.510006
+ 2011-05-27 337.409992
+ 2011-11-17 377.410000
+ 2012-05-09 569.180023
+ 2012-10-17 644.610001
+ 2013-06-26 398.069992
+ 2013-10-03 483.409996
+ 2014-01-28 506.499977
+ 2014-04-22 531.700020
+ 2014-06-11 93.860001
+ 2014-10-17 97.669998
+ 2015-01-05 106.250000
+ 2015-04-16 126.169998
+ 2015-06-25 127.500000
+ 2015-12-18 106.029999
+ 2016-05-05 93.239998
+ 2016-07-08 96.680000
+ 2016-09-01 106.730003
+ Name: Close, dtype: float64
+
+[/code]
+
+[code]
+
+ # Create a DataFrame with trades, including the price at the trade and the regime under which the trade is made.
+ apple_signals = pd.concat([
+ pd.DataFrame({"Price": apple.loc[apple["Signal"] == 1, "Close"],
+ "Regime": apple.loc[apple["Signal"] == 1, "Regime"],
+ "Signal": "Buy"}),
+ pd.DataFrame({"Price": apple.loc[apple["Signal"] == -1, "Close"],
+ "Regime": apple.loc[apple["Signal"] == -1, "Regime"],
+ "Signal": "Sell"}),
+ ])
+ apple_signals.sort_index(inplace = True)
+ apple_signals
+
+[/code]
+
+Date | Price | Regime | Signal
+---|---|---|---
+2010-03-16 | 224.449997 | 1.0 | Buy
+2010-06-11 | 253.509995 | -1.0 | Sell
+2010-06-18 | 274.070011 | 1.0 | Buy
+2010-07-22 | 259.020000 | -1.0 | Sell
+2010-09-20 | 283.230007 | 1.0 | Buy
+2011-03-30 | 348.630009 | 0.0 | Sell
+2011-03-31 | 348.510006 | -1.0 | Sell
+2011-05-12 | 346.569988 | 1.0 | Buy
+2011-05-27 | 337.409992 | -1.0 | Sell
+2011-07-14 | 357.770004 | 1.0 | Buy
+2011-11-17 | 377.410000 | -1.0 | Sell
+2011-12-28 | 402.640003 | 1.0 | Buy
+2012-05-09 | 569.180023 | -1.0 | Sell
+2012-06-25 | 570.770020 | 1.0 | Buy
+2012-10-17 | 644.610001 | -1.0 | Sell
+2013-05-17 | 433.260010 | 1.0 | Buy
+2013-06-26 | 398.069992 | -1.0 | Sell
+2013-07-31 | 452.529984 | 1.0 | Buy
+2013-10-03 | 483.409996 | -1.0 | Sell
+2013-10-16 | 501.110001 | 1.0 | Buy
+2014-01-28 | 506.499977 | -1.0 | Sell
+2014-03-26 | 539.779991 | 1.0 | Buy
+2014-04-22 | 531.700020 | -1.0 | Sell
+2014-04-25 | 571.939980 | 1.0 | Buy
+2014-06-11 | 93.860001 | -1.0 | Sell
+2014-08-18 | 99.160004 | 1.0 | Buy
+2014-10-17 | 97.669998 | -1.0 | Sell
+2014-10-28 | 106.739998 | 1.0 | Buy
+2015-01-05 | 106.250000 | -1.0 | Sell
+2015-02-05 | 119.940002 | 1.0 | Buy
+2015-04-16 | 126.169998 | -1.0 | Sell
+2015-04-28 | 130.559998 | 1.0 | Buy
+2015-06-25 | 127.500000 | -1.0 | Sell
+2015-10-27 | 114.550003 | 1.0 | Buy
+2015-12-18 | 106.029999 | -1.0 | Sell
+2016-03-11 | 102.260002 | 1.0 | Buy
+2016-05-05 | 93.239998 | -1.0 | Sell
+2016-07-01 | 95.889999 | 1.0 | Buy
+2016-07-08 | 96.680000 | -1.0 | Sell
+2016-07-25 | 97.339996 | 1.0 | Buy
+2016-09-01 | 106.730003 | 1.0 | Sell
+
+[code]
+
+    # Let's see the profitability of long trades
+    # (each comparison is parenthesized because & binds more tightly than ==)
+    apple_long_profits = pd.DataFrame({
+        "Price": apple_signals.loc[(apple_signals["Signal"] == "Buy") &
+                                   (apple_signals["Regime"] == 1), "Price"],
+        "Profit": pd.Series(apple_signals["Price"] - apple_signals["Price"].shift(1)).loc[
+            apple_signals.loc[(apple_signals["Signal"].shift(1) == "Buy") & (apple_signals["Regime"].shift(1) == 1)].index
+        ].tolist(),
+        "End Date": apple_signals["Price"].loc[
+            apple_signals.loc[(apple_signals["Signal"].shift(1) == "Buy") & (apple_signals["Regime"].shift(1) == 1)].index
+        ].index
+    })
+    apple_long_profits
+
+[/code]
+
+Date | End Date | Price | Profit
+---|---|---|---
+2010-03-16 | 2010-06-11 | 224.449997 | 29.059998
+2010-06-18 | 2010-07-22 | 274.070011 | -15.050011
+2010-09-20 | 2011-03-30 | 283.230007 | 65.400002
+2011-05-12 | 2011-05-27 | 346.569988 | -9.159996
+2011-07-14 | 2011-11-17 | 357.770004 | 19.639996
+2011-12-28 | 2012-05-09 | 402.640003 | 166.540020
+2012-06-25 | 2012-10-17 | 570.770020 | 73.839981
+2013-05-17 | 2013-06-26 | 433.260010 | -35.190018
+2013-07-31 | 2013-10-03 | 452.529984 | 30.880012
+2013-10-16 | 2014-01-28 | 501.110001 | 5.389976
+2014-03-26 | 2014-04-22 | 539.779991 | -8.079971
+2014-04-25 | 2014-06-11 | 571.939980 | -478.079979
+2014-08-18 | 2014-10-17 | 99.160004 | -1.490006
+2014-10-28 | 2015-01-05 | 106.739998 | -0.489998
+2015-02-05 | 2015-04-16 | 119.940002 | 6.229996
+2015-04-28 | 2015-06-25 | 130.559998 | -3.059998
+2015-10-27 | 2015-12-18 | 114.550003 | -8.520004
+2016-03-11 | 2016-05-05 | 102.260002 | -9.020004
+2016-07-01 | 2016-07-08 | 95.889999 | 0.790001
+2016-07-25 | 2016-09-01 | 97.339996 | 9.390007
+
+Above, we can see that during the trade opened on April 25th, 2014, there was a
+massive drop in the price of Apple stock, and it looks like our trading system
+would do badly. But this price drop is not because of a massive shock to Apple,
+but simply due to a stock split (Apple's 7-for-1 split in June 2014). And while
+dividend payments are not as obvious as a stock split, they may be affecting
+the performance of our system.
+
+[code]
+
+ # Let's see the result over the whole period for which we have Apple data
+ pandas_candlestick_ohlc(apple, stick = 45, otherseries = ["20d", "50d", "200d"])
+
+[/code]
+
+
+
+We don't want our trading system to be behaving poorly because of stock splits
+and dividend payments. How should we handle this? One approach would be to
+obtain historical stock split and dividend payment data and design a trading
+system for handling these. This would most realistically represent the
+behavior of the stock and could be considered the best solution, but it is
+more complicated. Another solution would be to adjust the prices to account
+for stock splits and dividend payments.
+
+Yahoo! Finance only provides the adjusted closing price of a stock, but this
+is all we need to get adjusted opening, high, and low prices. The adjusted
+close is computed like so:
+
+
+
+where  is
+the multiplier used for the adjustment. Solving for
+ requires
+only division and thus we can use the closing price and the adjusted closing
+price to adjust all prices in the series.
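+
+As a small worked example of that division, the sketch below adjusts one day's prices. The numbers are made up (a stock trading around $630 whose adjusted close is $90, as might be seen before a 7-for-1 split), and `adjust_day` is a hypothetical helper, not part of pandas.
+
```python
def adjust_day(open_, high, low, close, adj_close):
    """Scale one day's prices by the multiplier adj_close / close."""
    return {
        "Open": open_ * adj_close / close,
        "High": high * adj_close / close,
        "Low": low * adj_close / close,
        "Close": adj_close,
    }


day = adjust_day(open_=630.0, high=637.0, low=623.0, close=630.0, adj_close=90.0)
print(day)
# {'Open': 90.0, 'High': 91.0, 'Low': 89.0, 'Close': 90.0}
```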
+
+Let's go back, adjust the apple data, and reevaluate our trading system using
+the adjusted data.
+
+[code]
+
+    def ohlc_adj(dat):
+        """
+        :param dat: pandas DataFrame with stock data, including "Open", "High", "Low", "Close", and "Adj Close", with "Adj Close" containing adjusted closing prices
+
+        :return: pandas DataFrame with adjusted stock data
+
+        This function adjusts stock data for splits, dividends, etc., returning a data frame with
+        "Open", "High", "Low" and "Close" columns. The input DataFrame is similar to that returned
+        by pandas Yahoo! Finance API.
+        """
+        return pd.DataFrame({"Open": dat["Open"] * dat["Adj Close"] / dat["Close"],
+                             "High": dat["High"] * dat["Adj Close"] / dat["Close"],
+                             "Low": dat["Low"] * dat["Adj Close"] / dat["Close"],
+                             "Close": dat["Adj Close"]})
+
+    apple_adj = ohlc_adj(apple)
+
+    # This next code repeats all the earlier analysis we did on the adjusted data
+
+    apple_adj["20d"] = np.round(apple_adj["Close"].rolling(window = 20, center = False).mean(), 2)
+    apple_adj["50d"] = np.round(apple_adj["Close"].rolling(window = 50, center = False).mean(), 2)
+    apple_adj["200d"] = np.round(apple_adj["Close"].rolling(window = 200, center = False).mean(), 2)
+
+    apple_adj['20d-50d'] = apple_adj['20d'] - apple_adj['50d']
+    # np.where() is a vectorized if-else function, where a condition is checked for each component of a vector, and the first argument passed is used when the condition holds, and the other passed if it does not
+    apple_adj["Regime"] = np.where(apple_adj['20d-50d'] > 0, 1, 0)
+    # We have 1's for bullish regimes and 0's for everything else. Below I replace bearish regimes's values with -1, and to maintain the rest of the vector, the second argument is apple_adj["Regime"]
+    apple_adj["Regime"] = np.where(apple_adj['20d-50d'] < 0, -1, apple_adj["Regime"])
+    # To ensure that all trades close out, I temporarily change the regime of the last row to 0
+    # (.loc with the last index label is used because the .ix indexer was removed in pandas 1.0)
+    last_day = apple_adj.index[-1]
+    regime_orig = apple_adj.loc[last_day, "Regime"]
+    apple_adj.loc[last_day, "Regime"] = 0
+    apple_adj["Signal"] = np.sign(apple_adj["Regime"] - apple_adj["Regime"].shift(1))
+    # Restore original regime data
+    apple_adj.loc[last_day, "Regime"] = regime_orig
+
+    # Create a DataFrame with trades, including the price at the trade and the regime under which the trade is made.
+    apple_adj_signals = pd.concat([
+        pd.DataFrame({"Price": apple_adj.loc[apple_adj["Signal"] == 1, "Close"],
+                      "Regime": apple_adj.loc[apple_adj["Signal"] == 1, "Regime"],
+                      "Signal": "Buy"}),
+        pd.DataFrame({"Price": apple_adj.loc[apple_adj["Signal"] == -1, "Close"],
+                      "Regime": apple_adj.loc[apple_adj["Signal"] == -1, "Regime"],
+                      "Signal": "Sell"}),
+    ])
+    apple_adj_signals.sort_index(inplace = True)
+    # Each comparison is parenthesized because & binds more tightly than ==
+    apple_adj_long_profits = pd.DataFrame({
+        "Price": apple_adj_signals.loc[(apple_adj_signals["Signal"] == "Buy") &
+                                       (apple_adj_signals["Regime"] == 1), "Price"],
+        "Profit": pd.Series(apple_adj_signals["Price"] - apple_adj_signals["Price"].shift(1)).loc[
+            apple_adj_signals.loc[(apple_adj_signals["Signal"].shift(1) == "Buy") & (apple_adj_signals["Regime"].shift(1) == 1)].index
+        ].tolist(),
+        "End Date": apple_adj_signals["Price"].loc[
+            apple_adj_signals.loc[(apple_adj_signals["Signal"].shift(1) == "Buy") & (apple_adj_signals["Regime"].shift(1) == 1)].index
+        ].index
+    })
+
+    pandas_candlestick_ohlc(apple_adj, stick = 45, otherseries = ["20d", "50d", "200d"])
+
+[/code]
+
+
+
+[code]
+
+ apple_adj_long_profits
+
+[/code]
+
+Date | End Date | Price | Profit
+---|---|---|---
+2010-03-16 | 2010-06-10 | 29.355667 | 3.408371
+2010-06-18 | 2010-07-22 | 35.845436 | -1.968381
+2010-09-20 | 2011-03-30 | 37.043466 | 8.553623
+2011-05-12 | 2011-05-27 | 45.327660 | -1.198030
+2011-07-14 | 2011-11-17 | 46.792503 | 2.568702
+2011-12-28 | 2012-05-09 | 52.661020 | 21.781659
+2012-06-25 | 2012-10-17 | 74.650634 | 10.019459
+2013-05-17 | 2013-06-26 | 57.882798 | -4.701326
+2013-07-31 | 2013-10-04 | 60.457234 | 4.500835
+2013-10-16 | 2014-01-28 | 67.389473 | 1.122523
+2014-03-11 | 2014-03-17 | 72.948554 | -1.272298
+2014-03-24 | 2014-04-22 | 73.370393 | -1.019203
+2014-04-25 | 2014-10-17 | 77.826851 | 16.191371
+2014-10-28 | 2015-01-05 | 102.749105 | -0.028185
+2015-02-05 | 2015-04-16 | 116.413846 | 6.046838
+2015-04-28 | 2015-06-26 | 126.721620 | -3.184117
+2015-10-27 | 2015-12-18 | 112.152083 | -7.897288
+2016-03-10 | 2016-05-05 | 100.015950 | -7.278331
+2016-06-23 | 2016-06-27 | 95.582210 | -4.038123
+2016-06-30 | 2016-07-11 | 95.084904 | 1.372569
+2016-07-25 | 2016-09-01 | 96.815526 | 9.914477
+
+As you can see, adjusting for dividends and stock splits makes a big
+difference. We will use this data from now on.
+
+Let's now create a simulated portfolio of $1,000,000, and see how it would
+behave, according to the rules we have established. This includes:
+
+ * Investing only 10% of the portfolio in any trade
+ * Exiting the position if losses exceed 20% of the value of the trade.
+
+When simulating, bear in mind that:
+
+ * Trades are done in batches of 100 shares.
+ * Our stop-loss rule involves placing an order to sell the stock the moment the price drops below the specified level. Thus we need to check whether the lows during this period ever go low enough to trigger the stop-loss. Realistically, unless we buy a put option, we cannot guarantee that we will sell the stock at the price we set at the stop-loss, but we will use this as the selling price anyway for the sake of simplicity.
+ * Every trade is accompanied by a commission to the broker, which should be accounted for. I do not do so here.
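+
+To make the sizing rule concrete, here is a minimal sketch of how many shares
+a single trade may hold. The helper name `position_size` and its parameters
+are my own illustration, not part of the lecture's code. It uses the entry
+price of the first Apple trade in the table above (29.355667) and ignores
+commissions.
+
+```python
+import math
+
+
+def position_size(cash, price, port_value=0.1, batch=100):
+    """Return (shares, trade_value) for one trade.
+
+    Hypothetical helper: at most `port_value` of `cash` is committed,
+    and shares are bought in whole batches of `batch`.
+    """
+    budget = math.floor(cash * port_value)  # money we are willing to commit
+    batch_cost = math.ceil(batch * price)   # cost of one batch, rounded up
+    batches = budget // batch_cost          # whole batches we can afford
+    shares = batches * batch
+    return shares, shares * price
+
+
+shares, trade_value = position_size(1_000_000, 29.355667)
+print(shares, round(trade_value, 4))
+```
+
+With a $1,000,000 portfolio this gives 34 batches, or 3,400 shares, worth a
+little under $100,000.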
+
+Here's how a backtest may look:
+
+[code]
+
+ # We need to get the low of the price during each trade.
+ tradeperiods = pd.DataFrame({"Start": apple_adj_long_profits.index,
+ "End": apple_adj_long_profits["End Date"]})
+ apple_adj_long_profits["Low"] = tradeperiods.apply(lambda x: min(apple_adj.loc[x["Start"]:x["End"], "Low"]), axis = 1)
+ apple_adj_long_profits
+
+[/code]
+
+Date | End Date | Price | Profit | Low
+---|---|---|---|---
+2010-03-16 | 2010-06-10 | 29.355667 | 3.408371 | 26.059775
+2010-06-18 | 2010-07-22 | 35.845436 | -1.968381 | 31.337127
+2010-09-20 | 2011-03-30 | 37.043466 | 8.553623 | 35.967068
+2011-05-12 | 2011-05-27 | 45.327660 | -1.198030 | 43.084626
+2011-07-14 | 2011-11-17 | 46.792503 | 2.568702 | 46.171251
+2011-12-28 | 2012-05-09 | 52.661020 | 21.781659 | 52.382438
+2012-06-25 | 2012-10-17 | 74.650634 | 10.019459 | 73.975759
+2013-05-17 | 2013-06-26 | 57.882798 | -4.701326 | 52.859502
+2013-07-31 | 2013-10-04 | 60.457234 | 4.500835 | 60.043080
+2013-10-16 | 2014-01-28 | 67.389473 | 1.122523 | 67.136651
+2014-03-11 | 2014-03-17 | 72.948554 | -1.272298 | 71.167335
+2014-03-24 | 2014-04-22 | 73.370393 | -1.019203 | 69.579335
+2014-04-25 | 2014-10-17 | 77.826851 | 16.191371 | 76.740971
+2014-10-28 | 2015-01-05 | 102.749105 | -0.028185 | 101.411076
+2015-02-05 | 2015-04-16 | 116.413846 | 6.046838 | 114.948237
+2015-04-28 | 2015-06-26 | 126.721620 | -3.184117 | 119.733299
+2015-10-27 | 2015-12-18 | 112.152083 | -7.897288 | 104.038477
+2016-03-10 | 2016-05-05 | 100.015950 | -7.278331 | 91.345994
+2016-06-23 | 2016-06-27 | 95.582210 | -4.038123 | 91.006996
+2016-06-30 | 2016-07-11 | 95.084904 | 1.372569 | 93.791913
+2016-07-25 | 2016-09-01 | 96.815526 | 9.914477 | 95.900485
+
+[code]
+
+    # Now we have all the information needed to simulate this strategy in apple_adj_long_profits
+    cash = 1000000
+    apple_backtest = pd.DataFrame({"Start Port. Value": [],
+                                   "End Port. Value": [],
+                                   "End Date": [],
+                                   "Shares": [],
+                                   "Share Price": [],
+                                   "Trade Value": [],
+                                   "Profit per Share": [],
+                                   "Total Profit": [],
+                                   "Stop-Loss Triggered": []})
+    port_value = .1  # Max proportion of portfolio bet on any trade
+    batch = 100      # Number of shares bought per batch
+    stoploss = .2    # % of trade loss that would trigger a stoploss
+    for index, row in apple_adj_long_profits.iterrows():
+        batches = np.floor(cash * port_value) // np.ceil(batch * row["Price"])  # Maximum number of batches of stocks invested in
+        trade_val = batches * batch * row["Price"]  # How much money is put on the line with each trade
+        if row["Low"] < (1 - stoploss) * row["Price"]:  # Account for the stop-loss
+            # We sell at the stop price, (1 - stoploss) * Price, so the loss per share is stoploss * Price
+            share_profit = np.round(-stoploss * row["Price"], 2)
+            stop_trig = True
+        else:
+            share_profit = row["Profit"]
+            stop_trig = False
+        profit = share_profit * batches * batch  # Compute profits
+        # Add a row to the backtest data frame containing the results of the trade
+        # (pd.concat is used because DataFrame.append was removed in pandas 2.0)
+        apple_backtest = pd.concat([apple_backtest, pd.DataFrame({
+            "Start Port. Value": cash,
+            "End Port. Value": cash + profit,
+            "End Date": row["End Date"],
+            "Shares": batch * batches,
+            "Share Price": row["Price"],
+            "Trade Value": trade_val,
+            "Profit per Share": share_profit,
+            "Total Profit": profit,
+            "Stop-Loss Triggered": stop_trig
+        }, index = [index])])
+        cash = max(0, cash + profit)
+
+    apple_backtest
+
+[/code]
+
+Date | End Date | End Port. Value | Profit per Share | Share Price | Shares | Start Port. Value | Stop-Loss Triggered | Total Profit | Trade Value
+---|---|---|---|---|---|---|---|---|---
+2010-03-16 | 2010-06-10 | 1.011588e+06 | 3.408371 | 29.355667 | 3400.0 | 1.000000e+06 | 0.0 | 11588.4614 | 99809.2678
+2010-06-18 | 2010-07-22 | 1.006077e+06 | -1.968381 | 35.845436 | 2800.0 | 1.011588e+06 | 0.0 | -5511.4668 | 100367.2208
+2010-09-20 | 2011-03-30 | 1.029172e+06 | 8.553623 | 37.043466 | 2700.0 | 1.006077e+06 | 0.0 | 23094.7821 | 100017.3582
+2011-05-12 | 2011-05-27 | 1.026536e+06 | -1.198030 | 45.327660 | 2200.0 | 1.029172e+06 | 0.0 | -2635.6660 | 99720.8520
+2011-07-14 | 2011-11-17 | 1.031930e+06 | 2.568702 | 46.792503 | 2100.0 | 1.026536e+06 | 0.0 | 5394.2742 | 98264.2563
+2011-12-28 | 2012-05-09 | 1.073316e+06 | 21.781659 | 52.661020 | 1900.0 | 1.031930e+06 | 0.0 | 41385.1521 | 100055.9380
+2012-06-25 | 2012-10-17 | 1.087343e+06 | 10.019459 | 74.650634 | 1400.0 | 1.073316e+06 | 0.0 | 14027.2426 | 104510.8876
+2013-05-17 | 2013-06-26 | 1.078880e+06 | -4.701326 | 57.882798 | 1800.0 | 1.087343e+06 | 0.0 | -8462.3868 | 104189.0364
+2013-07-31 | 2013-10-04 | 1.086532e+06 | 4.500835 | 60.457234 | 1700.0 | 1.078880e+06 | 0.0 | 7651.4195 | 102777.2978
+2013-10-16 | 2014-01-28 | 1.088328e+06 | 1.122523 | 67.389473 | 1600.0 | 1.086532e+06 | 0.0 | 1796.0368 | 107823.1568
+2014-03-11 | 2014-03-17 | 1.086547e+06 | -1.272298 | 72.948554 | 1400.0 | 1.088328e+06 | 0.0 | -1781.2172 | 102127.9756
+2014-03-24 | 2014-04-22 | 1.085120e+06 | -1.019203 | 73.370393 | 1400.0 | 1.086547e+06 | 0.0 | -1426.8842 | 102718.5502
+2014-04-25 | 2014-10-17 | 1.106169e+06 | 16.191371 | 77.826851 | 1300.0 | 1.085120e+06 | 0.0 | 21048.7823 | 101174.9063
+2014-10-28 | 2015-01-05 | 1.106140e+06 | -0.028185 | 102.749105 | 1000.0 | 1.106169e+06 | 0.0 | -28.1850 | 102749.1050
+2015-02-05 | 2015-04-16 | 1.111582e+06 | 6.046838 | 116.413846 | 900.0 | 1.106140e+06 | 0.0 | 5442.1542 | 104772.4614
+2015-04-28 | 2015-06-26 | 1.109035e+06 | -3.184117 | 126.721620 | 800.0 | 1.111582e+06 | 0.0 | -2547.2936 | 101377.2960
+2015-10-27 | 2015-12-18 | 1.101928e+06 | -7.897288 | 112.152083 | 900.0 | 1.109035e+06 | 0.0 | -7107.5592 | 100936.8747
+2016-03-10 | 2016-05-05 | 1.093921e+06 | -7.278331 | 100.015950 | 1100.0 | 1.101928e+06 | 0.0 | -8006.1641 | 110017.5450
+2016-06-23 | 2016-06-27 | 1.089480e+06 | -4.038123 | 95.582210 | 1100.0 | 1.093921e+06 | 0.0 | -4441.9353 | 105140.4310
+2016-06-30 | 2016-07-11 | 1.090989e+06 | 1.372569 | 95.084904 | 1100.0 | 1.089480e+06 | 0.0 | 1509.8259 | 104593.3944
+2016-07-25 | 2016-09-01 | 1.101895e+06 | 9.914477 | 96.815526 | 1100.0 | 1.090989e+06 | 0.0 | 10905.9247 | 106497.0786
+
+[code]
+
+ apple_backtest["End Port. Value"].plot()
+
+[/code]
+
+
+
+Our portfolio's value grew by 10% in about six years. Considering that only
+10% of the portfolio was ever involved in any single trade, this is not bad
+performance.
+
+Notice that this strategy never led to our stop-loss order being triggered.
+Does this mean we don't need stop-loss orders? There is no simple answer to
+this. After all, if we had chosen a different level at which a stop-loss would
+be triggered, we may have seen it triggered.
+
+Stop-loss orders are triggered automatically and ask no questions about why
+the price fell. This means that either a genuine change in trend or a
+momentary fluctuation can trigger a stop-loss. The latter is the more
+concerning case: not only do you have to pay for the order, but there is no
+guarantee that you will sell the stock at the price you set, which could make
+your losses worse. Meanwhile, the trend on which you based your trade still
+holds, and had the stop-loss not been triggered, you might have made a profit.
+That said, a stop-loss can help protect you against your own emotions, such
+as staying wedded to a trade even though it has lost its value. Stop-losses
+are also good to have if you cannot monitor or quickly access your portfolio,
+as when you are on vacation.
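+
+As a rough illustration of how sensitive the rule is to the chosen level, the
+sketch below checks one trade against two stop-loss levels. The function
+`stop_loss_outcome` is a hypothetical helper of mine, not part of the
+backtest above, and it assumes (unrealistically, as noted earlier) that the
+order fills exactly at the stop price. The numbers are the entry price, low,
+and exit price of the first Apple trade.
+
+```python
+def stop_loss_outcome(entry_price, period_low, exit_price, stoploss=0.2):
+    """Return (profit_per_share, triggered) for a long trade.
+
+    Hypothetical helper: assumes the stop order fills exactly at the
+    stop price, which a real market does not guarantee.
+    """
+    stop_price = (1 - stoploss) * entry_price
+    if period_low < stop_price:
+        # Stopped out: we sold at the stop price, not at the planned exit
+        return stop_price - entry_price, True
+    return exit_price - entry_price, False
+
+
+# With the 20% rule the low of 26.06 never reaches the stop price
+print(stop_loss_outcome(29.355667, 26.059775, 32.764038))
+# A tighter 10% stop would have been hit by the same low
+print(stop_loss_outcome(29.355667, 26.059775, 32.764038, stoploss=0.1))
+```
+
+Under these assumptions the 20% stop leaves the trade's profit untouched,
+while a 10% stop would have turned the same trade into a loss.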
+
+I have provided links both
+[for](http://www.investopedia.com/articles/stocks/09/use-stop-loss.asp) and
+["against"](http://www.marketwatch.com/story/why-i-stopped-using-stop-loss-orders-2013-05-09)
+the use of stop-loss orders, but from now on I'm not going to require our
+backtesting system to account for them. While less realistic (and I do
+believe an industrial-strength system should account for a stop-loss rule),
+this simplifies the backtesting task.
+
+A more realistic portfolio would not bet 10% of its value on only one stock;
+it would consider investing in multiple stocks. Multiple trades may be
+ongoing at any given time involving multiple companies, and most of the
+portfolio will be in stocks, not cash. Now that we will be investing in
+multiple stocks and exiting only when moving averages cross (not because of a
+stop-loss), we will need to change our approach to backtesting. For example,
+we will be using one **pandas** `DataFrame` to contain all buy and sell orders
+for all stocks being considered, and our loop above will have to track more
+information.
+
+I have written functions for creating order data for multiple stocks, and a
+function for performing the backtesting.
+
+[code]
+
+    def ma_crossover_orders(stocks, fast, slow):
+        """
+        :param stocks: A list of tuples, the first argument in each tuple being a string containing the ticker symbol of each stock (or however you want the stock represented, so long as it's unique), and the second being a pandas DataFrame containing the stocks, with a "Close" column and indexing by date (like the data frames returned by the Yahoo! Finance API)
+        :param fast: Integer for the number of days used in the fast moving average
+        :param slow: Integer for the number of days used in the slow moving average
+
+        :return: pandas DataFrame containing stock orders
+
+        This function takes a list of stocks and determines when each stock would be bought or sold depending on a moving average crossover strategy, returning a data frame with information about when the stocks in the portfolio are bought or sold according to the strategy
+        """
+        fast_str = str(fast) + 'd'
+        slow_str = str(slow) + 'd'
+        ma_diff_str = fast_str + '-' + slow_str
+
+        trades = pd.DataFrame({"Price": [], "Regime": [], "Signal": []})
+        for s in stocks:
+            # Get the moving averages, both fast and slow, along with the difference in the moving averages
+            s[1][fast_str] = np.round(s[1]["Close"].rolling(window = fast, center = False).mean(), 2)
+            s[1][slow_str] = np.round(s[1]["Close"].rolling(window = slow, center = False).mean(), 2)
+            s[1][ma_diff_str] = s[1][fast_str] - s[1][slow_str]
+
+            # np.where() is a vectorized if-else function, where a condition is checked for each component of a vector, and the first argument passed is used when the condition holds, and the other passed if it does not
+            s[1]["Regime"] = np.where(s[1][ma_diff_str] > 0, 1, 0)
+            # We have 1's for bullish regimes and 0's for everything else. Below I replace bearish regimes' values with -1, and to maintain the rest of the vector, the last argument is s[1]["Regime"]
+            s[1]["Regime"] = np.where(s[1][ma_diff_str] < 0, -1, s[1]["Regime"])
+            # To ensure that all trades close out, I temporarily change the regime of the last row to 0
+            # (.ix was removed from pandas, so the last row is addressed through .iloc/.loc)
+            regime_orig = s[1]["Regime"].iloc[-1]
+            s[1].loc[s[1].index[-1], "Regime"] = 0
+            s[1]["Signal"] = np.sign(s[1]["Regime"] - s[1]["Regime"].shift(1))
+            # Restore original regime data
+            s[1].loc[s[1].index[-1], "Regime"] = regime_orig
+
+            # Get signals
+            signals = pd.concat([
+                pd.DataFrame({"Price": s[1].loc[s[1]["Signal"] == 1, "Close"],
+                              "Regime": s[1].loc[s[1]["Signal"] == 1, "Regime"],
+                              "Signal": "Buy"}),
+                pd.DataFrame({"Price": s[1].loc[s[1]["Signal"] == -1, "Close"],
+                              "Regime": s[1].loc[s[1]["Signal"] == -1, "Regime"],
+                              "Signal": "Sell"}),
+            ])
+            signals.index = pd.MultiIndex.from_product([signals.index, [s[0]]], names = ["Date", "Symbol"])
+            # pd.concat is used because DataFrame.append was removed in pandas 2.0
+            trades = pd.concat([trades, signals])
+
+        trades.sort_index(inplace = True)
+        trades.index = pd.MultiIndex.from_tuples(trades.index, names = ["Date", "Symbol"])
+
+        return trades
+
+
+    def backtest(signals, cash, port_value = .1, batch = 100):
+        """
+        :param signals: pandas DataFrame containing buy and sell signals with stock prices and symbols, like that returned by ma_crossover_orders
+        :param cash: integer for starting cash value
+        :param port_value: maximum proportion of portfolio to risk on any single trade
+        :param batch: Trading batch sizes
+
+        :return: pandas DataFrame with backtesting results
+
+        This function backtests strategies, with the signals generated by the strategies being passed in the signals DataFrame. A fictitious portfolio is simulated and the returns generated by this portfolio are reported.
+        """
+
+        SYMBOL = 1  # Constant for which element in index represents symbol
+        portfolio = dict()    # Will contain how many stocks are in the portfolio for a given symbol
+        port_prices = dict()  # Tracks old trade prices for determining profits
+        # Dataframe that will contain backtesting report
+        results = pd.DataFrame({"Start Cash": [],
+                                "End Cash": [],
+                                "Portfolio Value": [],
+                                "Type": [],
+                                "Shares": [],
+                                "Share Price": [],
+                                "Trade Value": [],
+                                "Profit per Share": [],
+                                "Total Profit": []})
+
+        for index, row in signals.iterrows():
+            # These first few lines are done for any trade
+            shares = portfolio.setdefault(index[SYMBOL], 0)
+            trade_val = 0
+            batches = 0
+            cash_change = row["Price"] * shares  # Shares could potentially be a positive or negative number (cash_change will be added in the end; negative shares indicate a short)
+            portfolio[index[SYMBOL]] = 0  # For a given symbol, a position is effectively cleared
+
+            old_price = port_prices.setdefault(index[SYMBOL], row["Price"])
+            portfolio_val = 0
+            for key, val in portfolio.items():
+                portfolio_val += val * port_prices[key]
+
+            if row["Signal"] == "Buy" and row["Regime"] == 1:  # Entering a long position
+                batches = np.floor((portfolio_val + cash) * port_value) // np.ceil(batch * row["Price"])  # Maximum number of batches of stocks invested in
+                trade_val = batches * batch * row["Price"]  # How much money is put on the line with each trade
+                cash_change -= trade_val  # We are buying shares so cash will go down
+                portfolio[index[SYMBOL]] = batches * batch  # Recording how many shares are currently invested in the stock
+                port_prices[index[SYMBOL]] = row["Price"]  # Record price
+                old_price = row["Price"]
+            elif row["Signal"] == "Sell" and row["Regime"] == -1:  # Entering a short
+                pass
+                # Do nothing; can we provide a method for shorting the market?
+            #else:
+                #raise ValueError("I don't know what to do with signal " + row["Signal"])
+
+            pprofit = row["Price"] - old_price  # Compute profit per share; old_price is set in such a way that entering a position results in a profit of zero
+
+            # Update report (pd.concat is used because DataFrame.append was removed in pandas 2.0)
+            results = pd.concat([results, pd.DataFrame({
+                "Start Cash": cash,
+                "End Cash": cash + cash_change,
+                "Portfolio Value": cash + cash_change + portfolio_val + trade_val,
+                "Type": row["Signal"],
+                "Shares": batch * batches,
+                "Share Price": row["Price"],
+                "Trade Value": abs(cash_change),
+                "Profit per Share": pprofit,
+                "Total Profit": batches * batch * pprofit
+            }, index = [index])])
+            cash += cash_change  # Final change to cash balance
+
+        results.sort_index(inplace = True)
+        results.index = pd.MultiIndex.from_tuples(results.index, names = ["Date", "Symbol"])
+
+        return results
+
+    # Get more stocks
+    microsoft = web.DataReader("MSFT", "yahoo", start, end)
+    google = web.DataReader("GOOG", "yahoo", start, end)
+    facebook = web.DataReader("FB", "yahoo", start, end)
+    twitter = web.DataReader("TWTR", "yahoo", start, end)
+    netflix = web.DataReader("NFLX", "yahoo", start, end)
+    amazon = web.DataReader("AMZN", "yahoo", start, end)
+    yahoo = web.DataReader("YHOO", "yahoo", start, end)
+    sony = web.DataReader("SNY", "yahoo", start, end)  # Note: "SNY" is Sanofi's ticker; Sony's ADR traded as "SNE" during this period
+    nintendo = web.DataReader("NTDOY", "yahoo", start, end)
+    ibm = web.DataReader("IBM", "yahoo", start, end)
+    hp = web.DataReader("HPQ", "yahoo", start, end)
+
+
+
+[/code]
+
+[code]
+
+    signals = ma_crossover_orders([("AAPL", ohlc_adj(apple)),
+                                   ("MSFT", ohlc_adj(microsoft)),
+                                   ("GOOG", ohlc_adj(google)),
+                                   ("FB", ohlc_adj(facebook)),
+                                   ("TWTR", ohlc_adj(twitter)),
+                                   ("NFLX", ohlc_adj(netflix)),
+                                   ("AMZN", ohlc_adj(amazon)),
+                                   ("YHOO", ohlc_adj(yahoo)),
+                                   # The output shown below was generated with ohlc_adj(yahoo) passed here by mistake, which is why its SNY rows duplicate the YHOO rows
+                                   ("SNY", ohlc_adj(sony)),
+                                   ("NTDOY", ohlc_adj(nintendo)),
+                                   ("IBM", ohlc_adj(ibm)),
+                                   ("HPQ", ohlc_adj(hp))],
+                                  fast = 20, slow = 50)
+    signals
+
+[/code]
+
+Date | Symbol | Price | Regime | Signal
+---|---|---|---|---
+2010-03-16 | AAPL | 29.355667 | 1.0 | Buy
+2010-03-16 | AMZN | 131.789993 | 1.0 | Buy
+2010-03-16 | GOOG | 282.318173 | -1.0 | Sell
+2010-03-16 | HPQ | 20.722316 | 1.0 | Buy
+2010-03-16 | IBM | 110.563240 | 1.0 | Buy
+2010-03-16 | MSFT | 24.677580 | -1.0 | Sell
+2010-03-16 | NFLX | 10.090000 | 1.0 | Buy
+2010-03-16 | NTDOY | 37.099998 | 1.0 | Buy
+2010-03-16 | SNY | 16.360001 | -1.0 | Sell
+2010-03-16 | YHOO | 16.360001 | -1.0 | Sell
+2010-03-17 | SNY | 16.500000 | 1.0 | Buy
+2010-03-17 | YHOO | 16.500000 | 1.0 | Buy
+2010-03-22 | GOOG | 278.472004 | 1.0 | Buy
+2010-03-23 | MSFT | 25.106096 | 1.0 | Buy
+2010-05-03 | GOOG | 265.035411 | -1.0 | Sell
+2010-05-10 | HPQ | 19.435830 | -1.0 | Sell
+2010-05-14 | NTDOY | 35.799999 | -1.0 | Sell
+2010-05-17 | SNY | 16.270000 | -1.0 | Sell
+2010-05-17 | YHOO | 16.270000 | -1.0 | Sell
+2010-05-19 | AMZN | 124.589996 | -1.0 | Sell
+2010-05-19 | MSFT | 23.835187 | -1.0 | Sell
+2010-05-21 | IBM | 108.322991 | -1.0 | Sell
+2010-06-10 | AAPL | 32.764038 | 0.0 | Sell
+2010-06-11 | AAPL | 33.156405 | -1.0 | Sell
+2010-06-18 | AAPL | 35.845436 | 1.0 | Buy
+2010-06-28 | IBM | 111.397697 | 1.0 | Buy
+2010-07-01 | IBM | 105.861499 | -1.0 | Sell
+2010-07-06 | IBM | 106.630175 | 1.0 | Buy
+2010-07-09 | NTDOY | 36.950001 | 1.0 | Buy
+2010-07-20 | IBM | 109.298956 | -1.0 | Sell
+… | … | … | … | …
+2016-06-23 | AAPL | 95.582210 | 1.0 | Buy
+2016-06-23 | TWTR | 17.040001 | 1.0 | Buy
+2016-06-27 | AAPL | 91.544087 | -1.0 | Sell
+2016-06-27 | FB | 108.970001 | -1.0 | Sell
+2016-06-28 | SNY | 36.040001 | -1.0 | Sell
+2016-06-28 | YHOO | 36.040001 | -1.0 | Sell
+2016-06-30 | AAPL | 95.084904 | 1.0 | Buy
+2016-06-30 | NFLX | 91.480003 | 0.0 | Sell
+2016-07-01 | NFLX | 96.669998 | -1.0 | Sell
+2016-07-01 | SNY | 37.990002 | 1.0 | Buy
+2016-07-01 | YHOO | 37.990002 | 1.0 | Buy
+2016-07-11 | AAPL | 96.457473 | -1.0 | Sell
+2016-07-11 | NTDOY | 27.700001 | 1.0 | Buy
+2016-07-14 | MSFT | 53.407133 | 1.0 | Buy
+2016-07-25 | AAPL | 96.815526 | 1.0 | Buy
+2016-07-25 | FB | 121.629997 | 1.0 | Buy
+2016-07-26 | GOOG | 738.419983 | 1.0 | Buy
+2016-08-18 | NFLX | 96.160004 | 1.0 | Buy
+2016-09-01 | AAPL | 106.730003 | 1.0 | Sell
+2016-09-02 | AMZN | 772.440002 | 1.0 | Sell
+2016-09-02 | FB | 126.510002 | 1.0 | Sell
+2016-09-02 | GOOG | 771.460022 | 1.0 | Sell
+2016-09-02 | HPQ | 14.490000 | 1.0 | Sell
+2016-09-02 | IBM | 159.550003 | 1.0 | Sell
+2016-09-02 | MSFT | 57.669998 | 1.0 | Sell
+2016-09-02 | NFLX | 97.379997 | 1.0 | Sell
+2016-09-02 | NTDOY | 28.840000 | 1.0 | Sell
+2016-09-02 | SNY | 43.279999 | 1.0 | Sell
+2016-09-02 | TWTR | 19.549999 | 1.0 | Sell
+2016-09-02 | YHOO | 43.279999 | 1.0 | Sell
+
+475 rows × 3 columns
+
+[code]
+
+ bk = backtest(signals, 1000000)
+ bk
+
+[/code]
+
+Date | Symbol | End Cash | Portfolio Value | Profit per Share | Share Price | Shares | Start Cash | Total Profit | Trade Value | Type
+---|---|---|---|---|---|---|---|---|---|---
+2010-03-16 | AAPL | 9.001907e+05 | 1.000000e+06 | 0.000000 | 29.355667 | 3400.0 | 1.000000e+06 | 0.0 | 99809.2678 | Buy
+2010-03-16 | AMZN | 8.079377e+05 | 1.000000e+06 | 0.000000 | 131.789993 | 700.0 | 9.001907e+05 | 0.0 | 92252.9951 | Buy
+2010-03-16 | GOOG | 8.079377e+05 | 1.000000e+06 | 0.000000 | 282.318173 | 0.0 | 8.079377e+05 | 0.0 | 0.0000 | Sell
+2010-03-16 | HPQ | 7.084706e+05 | 1.000000e+06 | 0.000000 | 20.722316 | 4800.0 | 8.079377e+05 | 0.0 | 99467.1168 | Buy
+2010-03-16 | IBM | 6.089637e+05 | 1.000000e+06 | 0.000000 | 110.563240 | 900.0 | 7.084706e+05 | 0.0 | 99506.9160 | Buy
+2010-03-16 | MSFT | 6.089637e+05 | 1.000000e+06 | 0.000000 | 24.677580 | 0.0 | 6.089637e+05 | 0.0 | 0.0000 | Sell
+2010-03-16 | NFLX | 5.090727e+05 | 1.000000e+06 | 0.000000 | 10.090000 | 9900.0 | 6.089637e+05 | 0.0 | 99891.0000 | Buy
+2010-03-16 | NTDOY | 4.126127e+05 | 1.000000e+06 | 0.000000 | 37.099998 | 2600.0 | 5.090727e+05 | 0.0 | 96459.9948 | Buy
+2010-03-16 | SNY | 4.126127e+05 | 1.000000e+06 | 0.000000 | 16.360001 | 0.0 | 4.126127e+05 | 0.0 | 0.0000 | Sell
+2010-03-16 | YHOO | 4.126127e+05 | 1.000000e+06 | 0.000000 | 16.360001 | 0.0 | 4.126127e+05 | 0.0 | 0.0000 | Sell
+2010-03-17 | SNY | 3.136127e+05 | 1.000000e+06 | 0.000000 | 16.500000 | 6000.0 | 4.126127e+05 | 0.0 | 99000.0000 | Buy
+2010-03-17 | YHOO | 2.146127e+05 | 1.000000e+06 | 0.000000 | 16.500000 | 6000.0 | 3.136127e+05 | 0.0 | 99000.0000 | Buy
+2010-03-22 | GOOG | 1.310711e+05 | 1.000000e+06 | 0.000000 | 278.472004 | 300.0 | 2.146127e+05 | 0.0 | 83541.6012 | Buy
+2010-03-23 | MSFT | 3.315733e+04 | 1.000000e+06 | 0.000000 | 25.106096 | 3900.0 | 1.310711e+05 | 0.0 | 97913.7744 | Buy
+2010-05-03 | GOOG | 1.126680e+05 | 9.959690e+05 | -13.436593 | 265.035411 | 0.0 | 3.315733e+04 | -0.0 | 79510.6233 | Sell
+2010-05-10 | HPQ | 2.059599e+05 | 9.897939e+05 | -1.286486 | 19.435830 | 0.0 | 1.126680e+05 | -0.0 | 93291.9840 | Sell
+2010-05-14 | NTDOY | 2.990399e+05 | 9.864139e+05 | -1.299999 | 35.799999 | 0.0 | 2.059599e+05 | -0.0 | 93079.9974 | Sell
+2010-05-17 | SNY | 3.966599e+05 | 9.850339e+05 | -0.230000 | 16.270000 | 0.0 | 2.990399e+05 | -0.0 | 97620.0000 | Sell
+2010-05-17 | YHOO | 4.942799e+05 | 9.836539e+05 | -0.230000 | 16.270000 | 0.0 | 3.966599e+05 | -0.0 | 97620.0000 | Sell
+2010-05-19 | AMZN | 5.814929e+05 | 9.786139e+05 | -7.199997 | 124.589996 | 0.0 | 4.942799e+05 | -0.0 | 87212.9972 | Sell
+2010-05-19 | MSFT | 6.744502e+05 | 9.736573e+05 | -1.270909 | 23.835187 | 0.0 | 5.814929e+05 | -0.0 | 92957.2293 | Sell
+2010-05-21 | IBM | 7.719409e+05 | 9.716411e+05 | -2.240249 | 108.322991 | 0.0 | 6.744502e+05 | -0.0 | 97490.6919 | Sell
+2010-06-10 | AAPL | 8.833386e+05 | 9.832296e+05 | 3.408371 | 32.764038 | 0.0 | 7.719409e+05 | 0.0 | 111397.7292 | Sell
+2010-06-11 | AAPL | 8.833386e+05 | 9.832296e+05 | 3.800738 | 33.156405 | 0.0 | 8.833386e+05 | 0.0 | 0.0000 | Sell
+2010-06-18 | AAPL | 7.865559e+05 | 9.832296e+05 | 0.000000 | 35.845436 | 2700.0 | 8.833386e+05 | 0.0 | 96782.6772 | Buy
+2010-06-28 | IBM | 6.974378e+05 | 9.832296e+05 | 0.000000 | 111.397697 | 800.0 | 7.865559e+05 | 0.0 | 89118.1576 | Buy
+2010-07-01 | IBM | 7.821270e+05 | 9.788006e+05 | -5.536198 | 105.861499 | 0.0 | 6.974378e+05 | -0.0 | 84689.1992 | Sell
+2010-07-06 | IBM | 6.861598e+05 | 9.788006e+05 | 0.000000 | 106.630175 | 900.0 | 7.821270e+05 | 0.0 | 95967.1575 | Buy
+2010-07-09 | NTDOY | 5.900898e+05 | 9.788006e+05 | 0.000000 | 36.950001 | 2600.0 | 6.861598e+05 | 0.0 | 96070.0026 | Buy
+2010-07-20 | IBM | 6.884589e+05 | 9.812025e+05 | 2.668781 | 109.298956 | 0.0 | 5.900898e+05 | 0.0 | 98369.0604 | Sell
+… | … | … | … | … | … | … | … | … | … | …
+2016-06-23 | AAPL | 3.951693e+05 | 1.863808e+06 | 0.000000 | 95.582210 | 1900.0 | 5.767755e+05 | 0.0 | 181606.1990 | Buy
+2016-06-23 | TWTR | 2.094333e+05 | 1.863808e+06 | 0.000000 | 17.040001 | 10900.0 | 3.951693e+05 | 0.0 | 185736.0109 | Buy
+2016-06-27 | AAPL | 3.833670e+05 | 1.856135e+06 | -4.038123 | 91.544087 | 0.0 | 2.094333e+05 | -0.0 | 173933.7653 | Sell
+2016-06-27 | FB | 5.795130e+05 | 1.862921e+06 | 3.770004 | 108.970001 | 0.0 | 3.833670e+05 | 0.0 | 196146.0018 | Sell
+2016-06-28 | SNY | 7.885450e+05 | 1.880959e+06 | 3.110001 | 36.040001 | 0.0 | 5.795130e+05 | 0.0 | 209032.0058 | Sell
+2016-06-28 | YHOO | 9.975770e+05 | 1.898997e+06 | 3.110001 | 36.040001 | 0.0 | 7.885450e+05 | 0.0 | 209032.0058 | Sell
+2016-06-30 | AAPL | 8.169157e+05 | 1.898997e+06 | 0.000000 | 95.084904 | 1900.0 | 9.975770e+05 | 0.0 | 180661.3176 | Buy
+2016-06-30 | NFLX | 9.907277e+05 | 1.893981e+06 | -2.640000 | 91.480003 | 0.0 | 8.169157e+05 | -0.0 | 173812.0057 | Sell
+2016-07-01 | NFLX | 9.907277e+05 | 1.893981e+06 | 2.549995 | 96.669998 | 0.0 | 9.907277e+05 | 0.0 | 0.0000 | Sell
+2016-07-01 | SNY | 8.045767e+05 | 1.893981e+06 | 0.000000 | 37.990002 | 4900.0 | 9.907277e+05 | 0.0 | 186151.0098 | Buy
+2016-07-01 | YHOO | 6.184257e+05 | 1.893981e+06 | 0.000000 | 37.990002 | 4900.0 | 8.045767e+05 | 0.0 | 186151.0098 | Buy
+2016-07-11 | AAPL | 8.016949e+05 | 1.896589e+06 | 1.372569 | 96.457473 | 0.0 | 6.184257e+05 | 0.0 | 183269.1987 | Sell
+2016-07-11 | NTDOY | 6.133349e+05 | 1.896589e+06 | 0.000000 | 27.700001 | 6800.0 | 8.016949e+05 | 0.0 | 188360.0068 | Buy
+2016-07-14 | MSFT | 4.264099e+05 | 1.896589e+06 | 0.000000 | 53.407133 | 3500.0 | 6.133349e+05 | 0.0 | 186924.9655 | Buy
+2016-07-25 | AAPL | 2.424604e+05 | 1.896589e+06 | 0.000000 | 96.815526 | 1900.0 | 4.264099e+05 | 0.0 | 183949.4994 | Buy
+2016-07-25 | FB | 6.001543e+04 | 1.896589e+06 | 0.000000 | 121.629997 | 1500.0 | 2.424604e+05 | 0.0 | 182444.9955 | Buy
+2016-07-26 | GOOG | -8.766857e+04 | 1.896589e+06 | 0.000000 | 738.419983 | 200.0 | 6.001543e+04 | 0.0 | 147683.9966 | Buy
+2016-08-18 | NFLX | -2.703726e+05 | 1.896589e+06 | 0.000000 | 96.160004 | 1900.0 | -8.766857e+04 | 0.0 | 182704.0076 | Buy
+2016-09-01 | AAPL | -6.758557e+04 | 1.915427e+06 | 9.914477 | 106.730003 | 0.0 | -2.703726e+05 | 0.0 | 202787.0057 | Sell
+2016-09-02 | AMZN | 1.641464e+05 | 1.979327e+06 | 213.000000 | 772.440002 | 0.0 | -6.758557e+04 | 0.0 | 231732.0006 | Sell
+2016-09-02 | FB | 3.539114e+05 | 1.986647e+06 | 4.880005 | 126.510002 | 0.0 | 1.641464e+05 | 0.0 | 189765.0030 | Sell
+2016-09-02 | GOOG | 5.082034e+05 | 1.993255e+06 | 33.040039 | 771.460022 | 0.0 | 3.539114e+05 | 0.0 | 154292.0044 | Sell
+2016-09-02 | HPQ | 7.081654e+05 | 2.006030e+06 | 0.925746 | 14.490000 | 0.0 | 5.082034e+05 | 0.0 | 199962.0000 | Sell
+2016-09-02 | IBM | 8.996254e+05 | 2.015652e+06 | 8.018727 | 159.550003 | 0.0 | 7.081654e+05 | 0.0 | 191460.0036 | Sell
+2016-09-02 | MSFT | 1.101470e+06 | 2.030572e+06 | 4.262865 | 57.669998 | 0.0 | 8.996254e+05 | 0.0 | 201844.9930 | Sell
+2016-09-02 | NFLX | 1.286492e+06 | 2.032890e+06 | 1.219993 | 97.379997 | 0.0 | 1.101470e+06 | 0.0 | 185021.9943 | Sell
+2016-09-02 | NTDOY | 1.482604e+06 | 2.040642e+06 | 1.139999 | 28.840000 | 0.0 | 1.286492e+06 | 0.0 | 196112.0000 | Sell
+2016-09-02 | SNY | 1.694676e+06 | 2.066563e+06 | 5.289997 | 43.279999 | 0.0 | 1.482604e+06 | 0.0 | 212071.9951 | Sell
+2016-09-02 | TWTR | 1.907771e+06 | 2.093922e+06 | 2.509998 | 19.549999 | 0.0 | 1.694676e+06 | 0.0 | 213094.9891 | Sell
+2016-09-02 | YHOO | 2.119843e+06 | 2.119843e+06 | 5.289997 | 43.279999 | 0.0 | 1.907771e+06 | 0.0 | 212071.9951 | Sell
+
+475 rows × 9 columns
+
+[code]
+
+    # Plot the portfolio value after the last trade of each day
+    bk["Portfolio Value"].groupby(level = 0).last().plot()
+
+[/code]
+
+
+
+A more realistic portfolio that can invest in any of a list of twelve (tech)
+stocks ends with growth of about 112% (from $1,000,000 to roughly
+$2,120,000). How good is this? While not bad on the surface, we will see we
+could have done better.
+
+## Benchmarking
+
+Backtesting is only part of evaluating the efficacy of a trading strategy. We
+would like to **benchmark** the strategy, or compare it to other available
+(usually well-known) strategies in order to determine how well we have done.
+
+Whenever you evaluate a trading system, there is one strategy that you should
+always check, one that beats all but a handful of managed mutual funds and
+investment managers: buy and hold [SPY](https://finance.yahoo.com/quote/SPY).
+The **efficient market hypothesis** claims that it is all but impossible for
+anyone to beat the market. Thus, one should always buy an index fund that
+merely reflects the composition of the market. SPY is an **exchange-traded
+fund** (a mutual fund that is traded on the market like a stock) whose value
+effectively represents the value of the stocks in the S&P 500 stock index.
+By buying and holding SPY, we are effectively trying to match our returns with
+the market rather than beat it.
+
+I obtain data on SPY below, and look at the profits for simply buying and
+holding SPY.
+
+[code]
+
+ spyder = web.DataReader("SPY", "yahoo", start, end)
+ spyder.iloc[[0,-1],:]
+
+[/code]
+
+Date | Open | High | Low | Close | Volume | Adj Close
+---|---|---|---|---|---|---
+2010-01-04 | 112.370003 | 113.389999 | 111.510002 | 113.330002 | 118944600 | 99.292299
+2016-09-01 | 217.369995 | 217.729996 | 216.029999 | 217.389999 | 93859000 | 217.389999
+
+[code]
+
+    # .ix was removed from pandas, so the first and last rows are addressed with .iloc
+    batches = 1000000 // np.ceil(100 * spyder["Adj Close"].iloc[0])  # Maximum number of batches of stocks invested in
+    trade_val = batches * batch * spyder["Adj Close"].iloc[0]  # How much money is used to buy SPY
+    final_val = batches * batch * spyder["Adj Close"].iloc[-1] + (1000000 - trade_val)  # Final value of the portfolio
+    final_val
+
+[/code]
+
+[code]
+
+ 2180977.0
+
+[/code]
+
+[code]
+
+    # We see that the buy-and-hold strategy beats the strategy we developed earlier. I would also like to see a plot.
+    ax_bench = (spyder["Adj Close"] / spyder["Adj Close"].iloc[0]).plot(label = "SPY")
+    ax_bench = (bk["Portfolio Value"].groupby(level = 0).last() / 1000000).plot(ax = ax_bench, label = "Portfolio")
+    ax_bench.legend(ax_bench.get_lines(), [l.get_label() for l in ax_bench.get_lines()], loc = 'best')
+    ax_bench
+
+[/code]
+
+
+
+Buying and holding SPY beats our trading system, at least as we have
+currently set it up, and we haven't even accounted for how expensive our more
+complex strategy is in terms of fees. Given both the opportunity cost and the
+expense associated with the active strategy, we should not use it.
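+
+To put the two final values on a common footing, the sketch below converts
+each into a total return and an approximate annualized return. The function
+names are my own, the final values are the ones reported above (the portfolio
+figure is rounded as displayed in the table), and for simplicity I assume
+both strategies ran over the SPY data window, 2010-01-04 to 2016-09-01.
+
+```python
+from datetime import date
+
+
+def total_return(start_value, end_value):
+    """Growth over the whole period, as a fraction of the starting value."""
+    return end_value / start_value - 1
+
+
+def annualized_return(start_value, end_value, first_day, last_day):
+    """Constant yearly growth rate that would produce the same final value."""
+    years = (last_day - first_day).days / 365.25
+    return (end_value / start_value) ** (1 / years) - 1
+
+
+first_day, last_day = date(2010, 1, 4), date(2016, 9, 1)
+final_values = {"SPY buy-and-hold": 2180977.0, "MA crossover": 2119843.0}
+for name, final in final_values.items():
+    print(name,
+          round(total_return(1_000_000, final), 4),
+          round(annualized_return(1_000_000, final, first_day, last_day), 4))
+```
+
+The gap between the two is modest before costs, which is why the unmodeled
+commissions of the active strategy matter so much to the comparison.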
+
+What could we do to improve the performance of our system? For starters, we
+could try diversifying. All the stocks we considered were tech companies,
+which means that if the tech industry is doing poorly, our portfolio will
+reflect that. We could try developing a system that can also short stocks or
+bet bearishly, so we can take advantage of movement in any direction. We could
+seek means for forecasting how high we expect a stock to move. Whatever we do,
+though, must beat this benchmark; otherwise there is an opportunity cost
+associated with our trading system.
+
+Other benchmark strategies exist, and if our trading system beat the "buy and
+hold SPY" strategy, we may check against them. Some such strategies include:
+
+ * Buy SPY when its closing monthly price is above its ten-month moving average.
+ * Buy SPY when its ten-month momentum is positive. (**Momentum** is the first difference of a moving average process, or $MO^q_t = MA^q_t - MA^q_{t-1}$.)
+
+(I first read of these strategies
+[here](https://www.r-bloggers.com/are-r2s-useful-in-finance-hypothesis-driven-development-in-reverse/?utm_source=feedburner&utm_medium=email&utm_campaign=Feed%3A+RBloggers+%28R+bloggers%29).)
+The general lesson still holds: _don't use a complex trading system with lots
+of active trading when a simple strategy involving an index fund without
+frequent trading beats it._
+[This is actually a very difficult requirement to meet.](http://www.nytimes.com/2015/03/15/your-money/how-many-mutual-funds-routinely-rout-the-market-zero.html?_r=0)
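+
+The two benchmark rules listed above can be sketched in a few lines of plain
+Python. This is only an illustration under my own assumptions: the function
+names are hypothetical, the monthly closes are made up rather than real SPY
+data, and a month without enough history for the moving average simply
+produces no signal.
+
+```python
+def moving_average(prices, q):
+    """q-period simple moving average; None until q observations exist."""
+    out = []
+    for t in range(len(prices)):
+        if t + 1 < q:
+            out.append(None)
+        else:
+            out.append(sum(prices[t - q + 1 : t + 1]) / q)
+    return out
+
+
+def benchmark_signals(monthly_closes, q=10):
+    """Return (price_above_ma, momentum_positive), one boolean per month."""
+    ma = moving_average(monthly_closes, q)
+    above = [m is not None and p > m for p, m in zip(monthly_closes, ma)]
+    # Momentum is the first difference of the moving average
+    momentum = [False] + [
+        prev is not None and cur is not None and cur - prev > 0
+        for prev, cur in zip(ma, ma[1:])
+    ]
+    return above, momentum
+
+
+# Made-up monthly closes, for illustration only
+closes = [100, 102, 101, 105, 107, 110, 108, 111, 115, 117, 116, 120]
+above, momentum = benchmark_signals(closes, q=10)
+print(above[-3:], momentum[-3:])
+```
+
+Note that the momentum rule needs one more month of history than the
+moving-average rule, since it compares two consecutive averages.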
+
+As a final note, suppose that your trading system _did_ manage to beat any
+baseline strategy thrown at it in backtesting. Does backtesting predict future
+performance? Not at all. [Backtesting has a propensity for
+overfitting](http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2745220), so
+just because backtesting predicts high growth doesn't mean that growth will
+hold in the future.
+
+## Conclusion
+
+While this lecture ends on a depressing note, keep in mind that [the efficient
+market hypothesis has many
+critics.](http://www.nytimes.com/2009/06/06/business/06nocera.html) My own
+opinion is that as trading becomes more algorithmic, beating the market will
+become more difficult. That said, it may be possible to beat the market, even
+though mutual funds seem incapable of doing so (bear in mind, though, that
+part of the reason mutual funds perform so poorly is their fees, which are
+much less of a concern for index funds).
+
+This lecture is very brief, covering only one type of strategy: strategies
+based on moving averages. Many other trading signals exist and are employed.
+Additionally, we never discussed in depth shorting stocks, currency trading,
+or stock options. Stock options, in particular, are a rich subject that offers
+many different ways to bet on the direction of a stock. You can read more
+about derivatives (including stock options) in the book
+_Derivatives Analytics with Python: Data Analysis, Models, Simulation,
+Calibration and Hedging_,
+[which is available from the University of Utah library (for University of Utah students).](http://proquest.safaribooksonline.com.ezproxy.lib.utah.edu/9781119037996)
+
+Another resource (which I used as a reference while writing this lecture) is
+the O'Reilly book _Python for Finance_,
+[also available from the University of Utah library.](http://proquest.safaribooksonline.com.ezproxy.lib.utah.edu/book/programming/python/9781491945360)
+
+Remember that it is possible (if not common) to lose money in the stock
+market. It's also true, though, that it's difficult to find returns like those
+found in stocks, and any investment strategy should take investing in them
+seriously. This lecture is intended to provide a starting point for evaluating
+stock trading and investment, and I hope you continue to explore these ideas.
+
+## Problems
+
+### Problem 1
+
+_Devise a trading strategy as described in lecture based on moving-average
+crossovers (you do not need a stop-loss). Pick a list of at least 15 stocks
+that have existed since January 1st, 2010. Backtest your strategy with the
+stocks chosen and benchmark the performance of your portfolio against the
+performance of SPY. Are you able to beat the market?_
+
+### Problem 2
+
+_Realistically, with every trade a commission is applied. Read about how
+[commission](http://www.investopedia.com/terms/c/commission.asp) works, and
+modify the `backtest()` function in the lecture to allow multiple commission
+structures (flat fee, percentage of portfolio, etc.) to be simulated._
+
+_Additionally, our current moving average crossover strategy results in a
+trading signal triggering the moment two moving averages cross. We would like
+to make sure signals are more robust, either by:_
+
+ 1. _Triggering a trade when the moving averages differ by a fixed amount_
+ 2. _Triggering a trade when the moving averages differ by some amount of **(rolling) standard deviations**, which are defined by:_
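+
+$$\sigma_t^n = \sqrt{\frac{1}{n - 1} \sum_{i=0}^{n-1} \left(x_{t-i} - \bar{x}_t^n\right)^2}, \qquad \bar{x}_t^n = \frac{1}{n} \sum_{i=0}^{n-1} x_{t-i}$$
+
+_where $x_t$ is the price on day $t$ and $n$ is the number of days in the window._
+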
+_(**pandas** does have means for computing rolling standard deviations.)
+Regarding the latter, if the moving averages differ by at least the chosen
+number of rolling standard deviations, a
+trading signal is sent. Modify the function `ma_crossover_orders()` so that
+these restrictions can be implemented. Specifically, you should have the
+ability to set how many days are in the window of the rolling standard
+deviation (it need not be the same as either the fast or slow moving average
+windows), and how many standard deviations the moving averages must differ by
+in order for a signal to be sent. (The current behavior of these functions
+should still be possible; in fact, it should be the default behavior.)_
+
+_Once these changes have been made, repeat Problem 1, including a realistic
+commission scheme (consider looking up one from a brokerage firm) when
+simulating the performance of the portfolio, and requiring the moving averages
+to differ by some fixed amount or by some number of standard deviations in
+order for signals to be sent._
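+
+As a starting point for the second variant, here is a minimal sketch. It is not the lecture's `ma_crossover_orders()`: the function name `ma_band_signal`, its parameters, the choice to take the rolling standard deviation of the price itself, and the synthetic prices are all assumptions made for illustration. It only shows how `Series.rolling(...).std()` can gate a crossover signal.
+
+```python
+import numpy as np
+import pandas as pd
+
+
+def ma_band_signal(close, fast=20, slow=50, sd_window=30, n_sd=0.0):
+    """Return +1 / -1 / 0 per day for a moving-average crossover regime.
+
+    Hypothetical helper: a regime is only recognised once the fast and slow
+    moving averages differ by more than n_sd rolling standard deviations.
+    With n_sd=0 this reduces to the plain sign of (fast MA - slow MA).
+    """
+    fast_ma = close.rolling(window=fast).mean()
+    slow_ma = close.rolling(window=slow).mean()
+    # Rolling sample standard deviation of the price (an assumption; the
+    # window is independent of the fast and slow windows).
+    band = n_sd * close.rolling(window=sd_window).std()
+    diff = fast_ma - slow_ma
+    signal = pd.Series(0, index=close.index)
+    signal[diff > band] = 1
+    signal[diff < -band] = -1
+    return signal
+
+
+# Synthetic random-walk prices, purely for demonstration.
+rng = np.random.default_rng(0)
+prices = pd.Series(100 + rng.normal(0, 1, 300).cumsum())
+plain = ma_band_signal(prices)
+strict = ma_band_signal(prices, n_sd=1.0)
+print((plain != 0).sum(), (strict != 0).sum())
+```
+
+Keeping `n_sd=0.0` as the default preserves the current behavior, which is what the problem asks for; a larger `n_sd` can only shrink the set of days on which a regime is recognised.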
+
+### Problem 3
+
+_We did not set up our trading system to allow for shorting stocks. Short
+selling is much trickier, since losses from short selling are unlimited (a
+long position, on the other hand, limits losses to the total value of the
+assets purchased). Read about short selling
+[here](http://www.investopedia.com/university/shortselling/shortselling1.asp).
+Then modify the function `backtest()` to allow for short selling. How will the
+function decide how to conduct short sales, including how many shares to short
+and how to account for shorted stocks when conducting other trades? We leave
+this up to you to decide. As a hint, the number of shares being shorted can be
+represented internally in the function by a negative number._
+
+_Once this is done, repeat Problem 1, perhaps also using features implemented
+in Problem 2._
\ No newline at end of file
diff --git a/raw/Python object ids and mutable types.md b/raw/An exploration of Texas Death Row data.md
similarity index 100%
rename from raw/Python object ids and mutable types.md
rename to raw/An exploration of Texas Death Row data.md
diff --git a/raw/Getting started with Raspberry Pi - Building a Digital Photo Frame.md b/raw/Getting started with Raspberry Pi - Building a Digital Photo Frame.md
new file mode 100644
index 0000000..36cc084
--- /dev/null
+++ b/raw/Getting started with Raspberry Pi - Building a Digital Photo Frame.md
@@ -0,0 +1,1047 @@
+Original article: [Getting started with Raspberry Pi - Building a Digital Photo Frame](https://paulstamatiou.com/getting-started-raspberry-pi/)
+
+---
+
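+**What you can do with a tiny $35 computer, and how I built a digital photo frame**
+
+In early 2012, an interesting single-board computer with an odd name hit the market. For a mere $35, you could get a fully functional computer capable of running a real operating system.
+
+It's called the Raspberry Pi, and it is the brainchild of a UK charity, the Raspberry Pi Foundation. After seeing the number of students applying to study computer science keep falling, they saw a need for an affordable computer.
+
+As it turned out, this tiny, inexpensive, fully functional computer found a much bigger audience than expected. Several models have been released since then, including the $5 Pi Zero, and **more than 9 million Raspberry Pis have been sold**.
+
+This is a long post, so I have probably made a few mistakes along the way. Feel free to [let me know on Twitter](http://twitter.com/Stammy "Paul Stamatiou on Twitter"), thanks!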
+
+
+
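+- [What is it?](#它是什么?)
+ - [Is it like an Arduino? Not quite.](#这就像一块arduino吗?并不是。)
+ - [Power consumption](#电量消耗)
+ - [Pi for education](#用于教育的pi)
+ - [Pi community and resources](#pi社区和资源)
+- [What can you do with a Pi?](#有了pi,你能做什么?)
+ - [A whole lot.](#灰常多。)
+ - [Why did I get a Pi?](#为什么我会有一个pi?)
+ - [So, what can you do with it?](#所以,你可以拿它做什么?)
+ - [What is the Pi not so great for?](#pi不大适合做什么?)
+- [Getting started](#开始)
+ - [Installing an operating system](#操作系统安装)
+ - [What you'll need](#你需要什么)
+ - [Parts list](#零部件清单)
+ - [Optional](#optional)
+ - [Which operating system?](#什么操作系统?)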
+ - [Imaging the microSD card](#imaging-the-microsd-card)
+ - [Benchmarking, overclocking and cooling](#benchmarking-overclocking-and-cooling)
+ - [Heatsinks](#heatsinks)
+- [Using the I/O pins](#使用io引脚)
+ - [Pi electronics 101](#pi-electronics-101)
+ - [Parts](#parts)
+ - [Setup & powering an LED](#setup--powering-an-led)
+ - [Control the LED](#control-the-led)
+ - [Using a transistor](#using-a-transistor)
+ - [Use a relay](#use-a-relay)
+ - [What's next?](#whats-next)
+- [Building a digital photo frame](#做一个数码相框)
+ - [with a 10" 1920x1200 display](#with-a-10-1920x1200-display)
+ - [We're going to need a display](#were-going-to-need-a-display)
+ - [Mounting the display](#mounting-the-display)
+ - [Mounting the Pi](#mounting-the-pi)
+ - [Displaying the photos](#displaying-the-photos)
+ - [Turning the display off](#turning-the-display-off)
+ - [Wiring up a button and fan](#wiring-up-a-button-and-fan)
+ - [What's next with the Pi Frame?](#whats-next-with-the-pi-frame)
+
+
+
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09364-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09366-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09460-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09465-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09147-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-rpi-photo-frame-DSC00108-1500.jpg)
+
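+## What is it?
+###### Is it like an Arduino? Not quite.
+
+Although I had heard about the Pi for years, I never took a proper look at it and mentally filed it away as some kind of Arduino board for hobbyists. I could not have been more wrong.
+
+An Arduino is an open-source microcontroller board with I/O pins for controlling other electronics. The Pi, on the other hand, still has those I/O pins (GPIO) but also has a closed-source ARM **S**ystem **o**n a **C**hip (SoC). However, the Arduino is better suited to hooking up analog sensors out of the box, while the Pi works best with sensors that communicate over serial buses such as I2C or SPI.
+
+The fastest Arduino runs at 84MHz, while the Pi 3 runs at **1.2GHz**.
+
+So what do I mean when I say the Pi is a full computer? Surely something that cheap can't actually be usable... but it is. Equipped with a **64-bit 1.2GHz quad-core CPU**, the Pi 3 Model B has **1GB of RAM, 802.11n Wi-Fi and Bluetooth**, plus plenty of ports: 4 USB, HDMI, microSD and more. This allows it to run operating systems optimized for ARM chips, and quite a few are currently available: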
+
+* [Ubuntu MATE](https://ubuntu-mate.org/raspberry-pi/ "Ubuntu MATE for the Raspberry Pi 2 and Raspberry Pi 3")
+* [Raspbian](https://www.raspbian.org/ "Raspbian is a free operating system based on Debian optimized for the Raspberry Pi hardware.") and Raspbian Lite (Debian optimized for the Pi)
+* [OSMC](https://osmc.tv/2016/02/raspberry-pi-3-announced-with-osmc-support/ "Raspberry Pi 3 announced with OSMC support") and [OpenELEC](http://openelec.tv/) (media center operating systems)
+* [Pidora](http://pidora.ca/ "Pidora is a Fedora Remix optimized for the Raspberry Pi computer.") (Fedora for the Pi)
+* [Arch Linux ARM](https://archlinuxarm.org/platforms/armv8/broadcom/raspberry-pi-3 "Arch Linux, a lightweight and flexible Linux® distribution that tries to Keep It Simple.") (Arch optimized for ARM computers)
+* [Chromium OS for Single Board Computers](http://www.chromiumosforsbc.org/) (you may know it as Chrome OS, used on Chromebooks)
+* [Windows 10 IoT Core](https://developer.microsoft.com/en-us/windows/iot/win10/noobs) (a [variant of Windows 10](https://developer.microsoft.com/en-us/windows/iot) made for the Internet of Things)
+* Coming soon: [official Android support](http://arstechnica.com/gadgets/2016/05/google-to-bring-official-android-support-to-the-raspberry-pi-3/) from Google
+
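+By comparison, the smaller $5 Raspberry Pi Zero (v1.3 shown below with the camera module) has a single-core **1GHz CPU and 512MB of RAM**, which is still enough to run the operating systems above, along with a microSD slot, a mini HDMI port and two micro-USB ports. However, it lacks onboard networking, so you still need a USB Wi-Fi adapter if you want to get online.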
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09417-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09429-1500.jpg)
+
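+The Raspberry Pi has many competitors. Some are more powerful and more expensive. The [PINE64](https://www.pine64.com/ "A powerful 64-Bit expandable single board computer - Starting at just $15") comes with 2GB of RAM and Gigabit Ethernet, the 2GHz quad-core Odroid XU4 has USB 3.0 and Gigabit Ethernet, and the Banana Pi[1](#footnote-1), a blatant Raspberry Pi clone, has a SATA connector. However, none of them can beat the **Pi's massive community of developers and hobbyists**. That makes it easy to find support for the project you are building, or to use software written specifically for the Pi.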
+
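+###### Power consumption
+
+Oh, and this thing barely sips power! Unlike a larger computer, which you might think twice about running 24/7, **a Raspberry Pi costs next to nothing to run**. That makes the Pi very attractive as an always-on Linux server or as the basis for hardware-connected projects.
+
+Consumption depends on the model, and there are plenty of ways to reduce it further[2](#footnote-2), but a Pi 3 Model B draws about 1.4W at idle and rises to about 3.7W under load.
+
+In California, where electricity costs about 15 cents per kilowatt-hour, running a Pi 3 under load for an entire year costs only about $5 (and only about $2 a year at idle). A Pi Zero costs noticeably less, since it draws close to 0.7W under load. For comparison, my [4-disk Synology NAS](https://paulstamatiou.com/storage-for-photographers-part-2/ "Storage for photographers Part 2") costs about $3.50 _per month_ to run.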
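+
+The yearly figures come from simple arithmetic. Here is a small sketch, assuming the wattages and the $0.15 per kWh rate quoted above; `yearly_cost` is a hypothetical helper written for this illustration:
+
+```python
+HOURS_PER_YEAR = 24 * 365
+
+
+def yearly_cost(watts, dollars_per_kwh=0.15):
+    """Cost in dollars of running a constant load for one year."""
+    kwh_per_year = watts * HOURS_PER_YEAR / 1000
+    return kwh_per_year * dollars_per_kwh
+
+
+loads = [("Pi 3 under load", 3.7), ("Pi 3 idle", 1.4), ("Pi Zero under load", 0.7)]
+for label, watts in loads:
+    print(f"{label}: ${yearly_cost(watts):.2f} per year")
+```
+
+Plugging in your own electricity rate as `dollars_per_kwh` gives a quick estimate for wherever you live.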
+
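+The Pi's tiny power requirements open the door to fun mobile and embedded applications, since it can run for quite a while off an ordinary USB battery pack.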
+
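+###### Pi for education
+
+Before it became wildly popular among developers and hobbyists, the Pi was meant to be an affordable computer for education. That effort continues, with the Raspberry Pi Foundation working to bring the Pi into schools, and there are now new efforts like the [Kano computer kit](http://us.kano.me/) and the [Pi-Top](http://pi-top.com/ "Pi-Top"), both the result of successful crowdfunding campaigns.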
+
+[](https://turbo.paulstamatiou.com/uploads/2016/05/pstam_pitop_kits.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/05/pstam_kano_pc_kit_photo.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/05/pstam_kano_pc_photo.jpg)
+
+For these reasons, there is quite a lot of education-oriented software for the Pi: [Sonic Pi](https://www.raspberrypi.org/learning/getting-started-with-sonic-pi/worksheet/) for music, [Scratch](https://www.raspberrypi.org/learning/physical-computing-with-scratch/worksheet/ "Physical computing with Scratch"), [Processing](https://www.raspberrypi.org/learning/introduction-to-processing/worksheet/) and [Python](https://www.raspberrypi.org/learning/python-intro/), [Minecraft Pi](https://www.raspberrypi.org/learning/getting-started-with-minecraft-pi/worksheet/), [and more](https://www.raspberrypi.org/resources/). Of course, much of this can be done on any computer, but the Pi's price point makes it far more accessible.
+
+###### Pi community and resources
+
+Once you decide to start playing with a Raspberry Pi, you will be glad to know you are not alone. There is a large community of Pi hackers to help you imagine, build and debug your projects, and a wealth of Pi-specific hardware to power them.
+
+* [Raspberry Pi subreddit](http://reddit.com/r/raspberry_pi/)
+* [Raspberry Pi forums](https://www.raspberrypi.org/forums/)
+* [The MagPi magazine](https://www.raspberrypi.org/magpi/)
+* [Raspberry Pi eLinux wiki](http://elinux.org/RPi_Hub)
+* [adafruit](https://blog.adafruit.com/category/raspberry-pi/) - an online electronics store with a blog full of Pi news and tutorials
+* [Pi troubleshooting guide](http://elinux.org/R-Pi_Troubleshooting)
+
+
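+## What can you do with a Pi?
+##### A whole lot.
+
+#### Why did I get a Pi?
+
+I became curious about the Pi after seeing a friend use a Raspberry Pi Zero to control his Sonos setup with his Amazon Echo. Since I have a [Sonos system](https://paulstamatiou.com/stuff-i-use/ "Stuff I use"), I wanted to get going with the same setup. While I could easily have put the necessary Node server on my NAS, I would rather keep that box hidden behind my router for security. I don't mind forwarding a port to a simple Pi-based server.
+
+That week I ordered a Pi 3 Model B and [set up my own Sonos/Echo integration](https://github.com/rgraciano/echo-sonos "Amazon Echo integration with Sonos"), although I had originally wanted the tiny Pi Zero, which was sold out at the time.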
+
+
+
+
+
+
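+Then I started thinking about what else could be done with a Pi...
+
+#### So, what can you do with it?
+
+The first and most obvious route: **use it as a computer**. It is definitely not going to be the fastest machine in your house, but it handles basic tasks well. While you can hook it up to a full-size mouse, keyboard and desktop display, there are also plenty of options for smaller displays, including touchscreens, which make the Pi suitable for all kinds of projects.
+
+But remember, it has those I/O pins. With a Pi at the center, countless hardware hacks and Internet of Things ideas can easily become reality:
+
+* **Digital photo frame**
+
+ Digital photo frames are not a new thing; you probably remember the crappy versions from years ago where you stuck an SD card in to play your photos. Well, times have changed and tons of connected digital photo frames are on the market now. There's the $299 [Electric Objects EO1](https://www.electricobjects.com/), the $999+ [Klio](http://www.klioart.com/) and the $445+ [Meural digital canvas](https://meural.com/). So now lots of folks have turned to the Pi to build their own versions, including me (at the very bottom of this post!).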
+
+ 
+
+* **Magic mirror**
+
+ Smart mirrors are probably the most popular Raspberry Pi project in existence right now. They bring memories of futuristic movie interfaces and are relatively simple to build, especially with a large community developing them and [releasing software](https://github.com/MichMich/MagicMirror "Magic Mirror 2 the open source modular smart mirror platform") to drive them. In a nutshell: put a display running a full-page browser with a dark UI displaying info like news, weather or whatever is important to you behind a 2-way mirror and hang it in your house somewhere. There are lots of guides about this online: [1](https://www.reddit.com/r/raspberry_pi/comments/3oktfu/magic_mirror_how_to/ "Magic Mirror how to"), [2](http://michaelteeuw.nl/post/83188136918/magic-mirror-part-v-installing-the-raspberry-pi "Magic Mirror: Part V - Installing the Raspberry Pi"), [3](http://innate.cc/ "Smart Mirror Mini Form Factor Update"), [4](http://blog.dylanjpierce.com/raspberrypi/magicmirror/tutorial/2015/12/27/build-a-magic-mirror.html), [5](https://medium.com/@maxbraun/my-bathroom-mirror-is-smarter-than-yours-94b21c6671ba).
+
+ 
+ 
+ Probably the [most popular magic mirror build](https://medium.com/@maxbraun/my-bathroom-mirror-is-smarter-than-yours-94b21c6671ba). This one was done with a Fire TV Android stick since the creator couldn't find a Pi Zero in stock at the time. Same concept though.
+
+* Roll your own motion detecting Dropcam [with motionEyeOS](https://github.com/ccrisan/motioneyeos/wiki "motionEyeOS is a Linux distribution that turns a single-board computer into a video surveillance system.")
+* [Create an alarm clock](https://georgecushen.com/spotify-alarm-clock-raspberry-pi-ubuntu-linux/ "Wake Up with Spotify Alarm Clock for Raspberry Pi") that plays music from Spotify
+* Set up your own VPN server with [PiVPN (OpenVPN)](http://www.pivpn.io) for when you're not at home and connect to unsecured coffee shop Wi-Fi networks.
+* Build your own [portable Pi Desktop computer](https://learn.adafruit.com/10-raspberry-pi-desktop) or [tablet](http://francescopochetti.com/pipad-build-tablet-raspberry-pi/):
+
+ 
+ 
+* [Build a document scanner](http://www.instructables.com/id/Raspberry-Pi-Based-Document-Scanner-With-Automatic/ "Raspberry Pi Document Scanner With Automatic Upload to Dropbox") that automatically uploads to Dropbox
+* Home theater PC with OpenElec, OSMC, Kodi, [RasPlex](https://github.com/RasPlex/RasPlex/releases "Rasplex is a community driven port of Plex Home Theater for the Raspberry Pi") or even [Android TV](https://github.com/peyo-hd/device_brcm_rpi3 "Android TV for Raspberry Pi 3") ([see video](https://www.youtube.com/watch?v=NMDf5thqoMk "Raspberry Pi 3 Running Android Tv OS and N64 , SNES emulator test"))
+* Create an ["Onion Pi" Tor proxy](https://learn.adafruit.com/onion-pi/overview "Make a Raspberry Pi into a Anonymizing Tor Proxy")
+* Set up an ad blocker for your whole network with [Pi Hole](https://pi-hole.net/)
+* Program your own Pi-based robot with the [GoPiGo robot kit](http://www.dexterindustries.com/GoPiGo/):
+
+ 
+
+* DIY plant automatic watering system: [1](http://www.instructables.com/id/Automatic-Plant-Watering-and-Soil-Moisture-Sensing/ "Automatic Plant Watering and Soil Moisture Sensing"), [2](http://www.instructables.com/id/Raspberry-Pi-Irrigation-Controller/ "Raspberry Pi Irrigation Controller"), [3](https://hackaday.io/project/2711-autonomous-watering-system "Autonomous watering system"), [4](https://blog.serverdensity.com/automatically-watering-your-plants-with-sensors-a-pi-and-webhooks/)
+* Make a [personal voice assistant](http://www.instructables.com/id/Raspberri-Personal-Assistant/?ALLSTEPS)
+* Make a [portable gaming console](http://www.instructables.com/id/Raspberry-Pi-Portable-Games-Console/step6/Assembling-the-Console/), [Porta Pi Arcade system](http://www.instructables.com/id/Build-your-own-Mini-Arcade-Cabinet-with-Raspberry-/) or [Game Boy Zero](https://www.raspberrypi.org/blog/game-boy-zero/ "Game Boy Zero") using [RetroPie](https://retropie.org.uk/).
+
+ 
+ 
+ 
+
+* Turn your Raspberry Pi into a gaming console with the [Lakka](http://www.lakka.tv/ "The open source game console") Linux distro
+* Have your Pi run a [Twitter bot that tweets photos](http://blog.bandwidth.com/actually-using-your-raspberry-pi-part-4-twitter-bot/) from the Pi Camera
+* DIY Pi-controlled espresso machine using [iSPRESSO](http://ispresso.net/ "iSPRESSO is an appliance modification comprised of Raspberry Pi computer, solid state relays, temp sensor, buttons and a display, a custom Printed Circuit Board, and custom linux shell scripts and a good bit of python code")
+* [DIY Amazon Echo](https://github.com/amzn/alexa-avs-raspberry-pi) using Alexa voice service:
+
+ 
+
+* Set up [wireless electrical outlets via RF modules](https://timleland.com/wireless-power-outlets/ "Wireless power outlets") made for the Pi or hack your own [voice-controlled electrical outlets](http://www.instructables.com/id/Wireless-Multi-Channel-Voice-Controlled-Electrical/):
+
+ 
+
+* Create your own [BitTorrent downloading box](http://www.howtogeek.com/142044/how-to-turn-a-raspberry-pi-into-an-always-on-bittorrent-box/ "How to Turn a Raspberry Pi into an Always-On BitTorrent Box")
+* Make a [Raspberry Pi server cluster](http://makezine.com/projects/build-a-compact-4-node-raspberry-pi-cluster/):
+
+ 
+
+* Have the Raspberry Pi [open the door with a Slack chat command](http://blog.tryolabs.com/2016/06/01/raspberrypi-slack-our-humble-contribution-to-the-offices-laziness/ "RASPBERRY PI + SLACK: OUR HUMBLE CONTRIBUTION TO THE OFFICE'S LAZINESS")
+* Use it as a server for [Home Assistant](https://home-assistant.io/) or [pimatic](https://pimatic.org/ "pimatic is a home automation framework that runs on node.js. It provides a common extensible platform for home control and automation tasks") for all the connected devices and appliances in your home or [have it run HomeBridge](https://github.com/nfarina/homebridge/wiki/Running-HomeBridge-on-a-Raspberry-Pi "Homebridge is a lightweight NodeJS server you can run on your home network that emulates the iOS HomeKit API.") to allow Siri to control more home automation devices.
+* Use your Pi to host any of these [free web applications yourself](https://github.com/Kickball/awesome-selfhosted/blob/master/README.md)
+* DIY [Seenote](https://www.getseenote.com/) digital sticky note / to-do list
+* [Write a Python web server](http://mattrichardson.com/Raspberry-Pi-Flask/) to control electronics connected to the Pi's GPIO pins from any browser
+* Play a [MIDI file over a Tesla coil](https://www.youtube.com/watch?v=KhvExaTCXHA)
+* Create a [high-res networked outdoor camera](http://blog.wq.lc/16-megapixel-outdoor-network-camera-on-the-cheap/ "16 Megapixel Outdoor Network Camera on the Cheap"):
+
+ 
+
+* And [many](https://hackaday.io/list/3424-raspberry-pi-projects) more [projects](http://www.instructables.com/id/Raspberry-Pi-Projects/)...
+
+#### What is the Pi not so great for?
+
+For one, the Pi is a full computer and while it does not consume much power for a computer, it can still be overkill compared to an Arduino for simple hardware projects that don't require running an OS, GUI or networking.
+
+The Pi is not without a few tradeoffs. Ethernet networking (which is a 100Mbps link) and disk access (if you attach any storage device via USB) all go through the USB bus. So any simultaneous Ethernet network traffic and storage device usage will be bottlenecked by the same bus. Wi-Fi does not route through the USB bus, but you will still typically only see around 20-40Mbps over Wi-Fi instead of the theoretical 150Mbps for 802.11n.
+
+In short, the Pi is not quite the best for intense I/O and networking uses but can get the job done when speed is not mission critical.
+
+
+
+## Getting started
+
+##### Installing an operating system
+
+
+
+#### What you'll need
+
+Okay, let's get started! I'm going to assume that at the very least you'd like to just install some operating system on a Pi. First you'll need to pick which Raspberry Pi and accessories to buy. You didn't think you could get away with just buying the Pi itself, did you?
+
+But what Pi do you want? There's the larger and more powerful Pi 3 Model B and the tiny Pi Zero. I ended up getting both, but if the cost difference isn't a big issue for you I'd suggest starting with the Pi 3B. With integrated Wi-Fi as well as full-size HDMI and USB ports, it's almost a turn-key solution.
+
+###### Parts list
+* **Raspberry Pi 3 Model B** ($35)
+
+Available from:
+[The Pi Hut](https://thepihut.com/collections/raspberry-pi/products/raspberry-pi-3-model-b),
+[MCM](http://www.mcmelectronics.com/product/83-17300),
+[Adafruit](https://www.adafruit.com/product/3055), [Pimoroni](https://shop.pimoroni.com/collections/raspberry-pi/products/raspberry-pi-3),
+[Amazon](http://www.amazon.com/Raspberry-Pi-RASP-PI-3-Model-Motherboard/dp/B01CD5VC92/ref=as_li_ss_tl?ie=UTF8&keywords=raspberry%20pi&qid=1464412585&ref_=sr_1_4&s=pc&sr=1-4&linkCode=ll1&tag=paulstamatiou-20&linkId=a9639c24281e760bc09a6a807c37bcd6)
+
+
+The heart of your Pi project. Unfortunately, Amazon themselves do not sell it (just third parties, often at a higher price) so it's probably best to purchase from one of the official partners above.
+
+* [2.5A micro-USB power adapter](http://www.amazon.com/CanaKit-Raspberry-Supply-Adapter-Charger/dp/B00MARDJZ4/ref=as_li_ss_tl?ie=UTF8&refRID=1N7W6CHN64Q9EYQBEN21&linkCode=ll1&tag=paulstamatiou-20&linkId=34d578791be1daf5282c214b30895f06 "CanaKit 5V 2.5A Raspberry Pi 3 Power Supply / Adapter / Charger (UL Listed)") ($10)
+
+Just about any micro-USB power adapter should work, but the more power-hungry devices you connect to your Pi, the more critical a good power supply becomes. Lots of Raspberry Pi issues that crop up end up being caused by a bad power source, so it's best not to mess around; just get a good one. While I have a ton of USB chargers already, I opted to get a powerful one used by many Pi folks without issue. Another option is this [Anker Dual USB charger](http://www.amazon.com/Anker-Charger-PowerPort-Foldable-iPhone/dp/B012WMWPJW/ref=as_li_ss_tl?ie=UTF8&qid=1464558272&sr=8-1&keywords=wall+charger+anker&refinements=p_89%3AAnker&linkCode=ll1&tag=paulstamatiou-20&linkId=5064fe7ae75d29f829a0f59701c87a32 "Anker 24W Dual USB Wall Charger") with two 2.4A USB ports if you have other devices to run simultaneously, which I also ended up purchasing later on for my Pi Frame project below.
+
+* [16GB microSD memory card](http://www.amazon.com/SanDisk-Extreme-MicroSDXC-Adapter-SDSQXNE-064G-GN6MA/dp/B013CP5F90/ref=as_li_ss_tl?ie=UTF8&keywords=64GB%20sandisk%20extreme%20microsd&qid=1464419155&ref_=sr_1_1&sr=8-1&linkCode=ll1&tag=paulstamatiou-20&linkId=0df538f214b6b8fcba4d469830c692a0 "SanDisk Extreme 16GB microSDHC UHS-1") ($10)
+
+Anything larger than 8GB should be fine, but it's very important to get a fast card (Class 10 at least) from a reputable brand. Faulty or slow microSD cards are often a source of Raspberry Pi woes, so it's best not to skimp around in this department either.
+
+I already had a bunch of large and fast microSD cards from my GoPro, so I just used one: a 64GB SanDisk Extreme. If you want something even faster you can opt for the SanDisk Extreme Plus or [SanDisk Extreme Pro version](http://www.amazon.com/SanDisk-Extreme-Memory-Speeds-Ready-SDSDQXP-064G-G46A/dp/B008HK1YAA/ref=as_li_ss_tl?ie=UTF8&qid=1464420526&sr=8-3&keywords=64GB+sandisk+extreme+plus+microsd&linkCode=ll1&tag=paulstamatiou-20&linkId=6b513c25ae825406b86d59cf941371f3 "SanDisk Extreme Pro 16GB MicroSDHC memory card").
+
+* **You'll also need a keyboard, mouse, HDMI cable and a monitor or TV[3](#footnote-3).** I'm assuming you probably already have these lying around. A USB keyboard and mouse are ideal so there are no issues getting them to connect during initial setup. As for the display, you really only need it for the setup process. After that you can install a VNC server and access it from any computer (if your intended Pi use doesn't require a display all the time).
+
+###### Optional
+* [Case for the Raspberry Pi 3](http://www.amazon.com/Raspberry-Pi-Case-Black-fits/dp/B00UW2G1BS/ref=as_li_ss_tl?s=pc&ie=UTF8&qid=1464423290&sr=1-3&keywords=raspberry+pi+3+case+canakit&linkCode=ll1&tag=paulstamatiou-20&linkId=f6330a1e3c8876702f340774b21c3170 "Raspberry Pi Case") ($9)
+
+There are a million to choose from, so I encourage you to search around to see what's out there.
+
+* [Tiny wireless keyboard/trackpad](http://www.amazon.com/FAVI-FE02RF-BL-Wireless-Keyboard-SmartStick/dp/B0090BTY8Y/ref=as_li_ss_tl?ie=UTF8&psc=1&redirect=true&ref_=oh_aui_detailpage_o02_s00&linkCode=ll1&tag=paulstamatiou-20&linkId=2035b5790be962b2b8dd82dac8bfccc5 "FAVI FE02RF-BL Mini 2.4GHz Wireless PC / Tablet Keyboard") ($32)
+
+While I have a full-size USB keyboard and mouse, I ended up getting this tiny keyboard mouse combo for casual usage. Easy to hide away when not in use.
+
+#### Which operating system?
+
+There are lots of operating systems to choose from when it comes time to image your microSD card and start the installation process. The typical Raspberry Pi setup advice involves [installing NOOBS](https://www.raspberrypi.org/documentation/installation/noobs.md) which makes it easy to select between Raspbian, Pidora, OpenELEC, OSMC, RISC OS and Arch Linux. Most newcomers select the Debian-based Raspbian Jessie, the official operating system for the Raspberry Pi.
+
+I used Raspbian for a bit but then Ubuntu MATE 16.04 optimized for the Pi 3 came out. I went with Ubuntu MATE largely for a superficial reason: I liked how it looked out of the box. ¯\_(ツ)_/¯
+
+There are a few downsides compared to Raspbian. Hardware acceleration seems to be a bit experimental, so I would not suggest this OS if you plan on using it for any Home Theater PC needs. Also, not every Ubuntu package is ARM processor friendly, so you may find fewer applications you can use at the moment on Ubuntu MATE. Finally, Firefox is pretty slow on Ubuntu MATE, so you'll want to install another browser like Midori (or Chromium once they solve their current crash issue).
+
+#### Imaging the microSD card
+
+This part is usually harder than it needs to be. Typically you would download the image and use a command line tool like `dd` to manually image the card. You can't just drag the file to the microSD card.
+
+I ended up going the hard route with the command line to image my card with Ubuntu MATE, but I'll be listing an easier option below.
+
+* Download the [Ubuntu MATE 16.04 LTS image made for the Raspberry Pi](https://ubuntu-mate.org/download/). You should get a file named `ubuntu-mate-16.04-desktop-armhf-raspberry-pi.img.xz`.
+* Uncompress the image. You can install the command line tool [unxz](http://tukaani.org/xz/) or use a GUI app like [The Unarchiver](http://unarchiver.c3.cx/unarchiver). I chose the latter:
+```sh
+unxz ubuntu-mate-16.04-desktop-armhf-raspberry-pi.img.xz
+```
+* Plug in your microSD card (you'll probably get an SD card adapter with any card you purchased) and run
+```sh
+diskutil list
+```
+* Identify the disk for your microSD card. This should be something like `/dev/disk4`, **not** `/dev/disk4s1` (The "s" part denotes the partition and we want the whole disk). Triple-check that this is the correct device and size. You may want to eject it and put it back in and verify that the item is removed from `diskutil list` when you do this. You don't want to overwrite the wrong disk!
+* If this disk is not listed as being FAT32, you will need to format it as DOS FAT32. You can do this in OS X by opening up Disk Utility, selecting the microSD card, clicking Erase and then selecting MS-DOS (FAT).
+* Unmount the disk, with the "X" being the number you just identified:
+```sh
+diskutil unmountDisk /dev/diskX
+```
+* Now we get to start the actual imaging process. Verify the name and location of the downloaded .img file you extracted, and enter in the correct disk location (the /dev/diskX part) and run this:
+```sh
+# OS X ships BSD dd, which expects a lowercase size suffix (1m, not 1M)
+sudo dd bs=1m if=~/Desktop/ubuntu-mate-16.04-desktop-armhf-raspberry-pi.img of=/dev/diskX
+```
+* Alternatively you can try an even faster method by using the raw disk location instead of the buffered disk identifier. Just add an "r" before the disk like so: /dev/rdisk4. This may not work for everyone:
+
+```sh
+sudo dd bs=1m if=~/Desktop/ubuntu-mate-16.04-desktop-armhf-raspberry-pi.img of=/dev/rdiskX
+```
+* **This will take a long time.** For my 64GB card it took 48 minutes using the first approach (not the rdisk method). You get no status from the dd command while it's working but you can press CTRL+T to get an update.
+* When completed you can pull out the card and put it in your Raspberry Pi! If you have any questions about this process or are not using a Mac, there are lots of more detailed guides online like [this one](http://elinux.org/RPi_Easy_SD_Card_Setup "RPi Easy SD Card Setup") and [this one](http://www.tweaking4all.com/hardware/raspberry-pi/install-img-to-sd-card/ "Raspberry Pi - How to get an Operating System on a SD-Card").
+
+If you're dying to just get started immediately you can [buy a microSD card with NOOBS preinstalled](https://www.adafruit.com/products/1583) or [create your own](https://www.raspberrypi.org/documentation/installation/noobs.md) and install Raspbian Jessie or another OS instead of Ubuntu MATE.
+
+**But there's an easier way for OS X users**: a new tool called [ApplePi-Baker](http://www.tweaking4all.com/software/macosx-software/macosx-apple-pi-baker/ "MacOS X - ApplePi Baker - Prep SD-Cards for IMG or NOOBS"). It's ridiculously easy to use. It automatically detected my microSD card and all I had to do was select the extracted img file. It did the rest and my card was ready to use after a few minutes.
+
+
+
+
+#### Boot it up!
+
+With your new SD card ready to go, slide it into your Raspberry Pi 3 and power it up. It should boot into the setup wizard for whichever OS you chose. This part should be a breeze. After a short while you'll be greeted with your new OS! Take some time to browse around and get it set up to your liking. But you'll first want to resize the file system. You'll find it in the Ubuntu MATE welcome dialog here:
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-ubuntu-mate-first-boot.jpg)
+
+
+#### Set a DHCP reservation
+
+The majority of how I work with my Pi is actually over SSH and VNC rather than directly using it on a display. As such it's important that I can always find the Pi on my network at the same IP address. To do this I added a DHCP reservation with the AirPort Utility (I have an AirPort Extreme).
+
+
+
+
+
+To do this you'll need to know the MAC address of your Raspberry Pi. You can find it by running `ifconfig eth0` if you're connected via Ethernet, or `ifconfig wlan0` if connected via Wi-Fi.
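+
+If you would rather grab the MAC address from a script, here is a minimal sketch. It assumes a Linux system that exposes interfaces under `/sys/class/net` (as Raspbian and Ubuntu MATE do) and the usual `eth0` and `wlan0` interface names; `mac_address` is a hypothetical helper, not part of any Pi tooling.
+
+```python
+from pathlib import Path
+
+
+def mac_address(interface):
+    """Return the MAC address of a Linux network interface, or None if absent."""
+    address_file = Path("/sys/class/net") / interface / "address"
+    if not address_file.exists():
+        return None
+    return address_file.read_text().strip()
+
+
+# eth0 is wired Ethernet and wlan0 is Wi-Fi on a typical Pi setup.
+for name in ("eth0", "wlan0"):
+    print(name, mac_address(name))
+```
+
+An interface that is not present simply prints `None`, so the same script works whether the Pi is on Ethernet, Wi-Fi or both.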
+
+Now you can always ssh into your Pi from any computer on your network (or from any computer if you also set up port forwarding) with the same local IP. The default username for Raspbian is `pi` but you will have set your own username for Ubuntu MATE.
+
+```sh
+ssh pi@10.0.1.46
+```
+
+#### Setting up VNC
+
+Unless you have a dedicated display for your Pi, it will probably be annoying to constantly have it plugged into your TV. Installing a VNC server on the Pi and a VNC client on another computer will let you see and control the Pi with the window manager GUI instead of via SSH command line.
+
+While either SSH'd in or directly on your Pi, install the TightVNC server:
+
+```sh
+sudo apt-get update
+sudo apt-get install tightvncserver
+tightvncserver
+```
+
+Now you need to enable the VNC server on the Pi. While it's sufficient to just type `tightvncserver` to run it, you'll want to customize a few things to get a higher resolution display, especially if you're accessing it on your LAN. Running the following command will set up a virtual screen with a resolution of 1920x1200. You can use any screen resolution you like here within reason:
+
+```sh
+stammy@rpi:~$ vncserver :1 -geometry 1920x1200 -depth 24
+
+New 'X' desktop is rpi:1
+
+Starting applications specified in /home/stammy/.vnc/xstartup
+Log file is /home/stammy/.vnc/rpi:1.log
+```
+
+If you need to kill the server and change settings you can run `vncserver -kill :1`
+
+The VNC server is now running on display **:1**. Download the [VNC Viewer client from RealVNC](http://www.realvnc.com/download/) for your Mac and open it up. Type in the IP of your Pi on the network and append the :1 screen, like this:
+
+
+
+
+Type in the password for your Pi and you're set! You'll quickly notice it's not quite as snappy as if you were using your Pi with a physical display but it can get the job done.
+
+
+
+Ubuntu MATE 16.04 running on a Raspberry Pi 3 over VNC
+
+
+Now you can fully control your Pi without the need for a hardware display. You will need to manually start the VNC server on the Pi whenever the Pi is rebooted. I'm usually logged in via SSH and rarely reboot so it's not a huge deal for me to run that line every now and then. But you can configure it to [automatically run at boot](http://elinux.org/RPi_VNC_Server).
+
+But you might be thinking: **doesn't OS X have its own screen sharing utility?** Why do I need to install another app for this?
+
+You're right! OS X supports VNC natively. To get this working we need to make the Pi discoverable via Bonjour and have it broadcast its new VNC support in a way that OS X can understand.
+
+```sh
+sudo apt-get install netatalk avahi-daemon
+```
+
+We're going to install `netatalk`[4](#footnote-4) which sets up the Apple Filing Protocol so we can also manipulate files on your Pi directly from the OS X Finder. If you're following this guide with Ubuntu MATE, you can leave off the `avahi-daemon` part as Ubuntu seems to come preinstalled with Avahi, the networking service discovery daemon.
+
+At this point your Raspberry Pi should be visible and accessible on the local network with your Mac! However, to be able to see the screen sharing capability advertised here you'll need to modify a file:
+
+```sh
+sudo nano /etc/avahi/services/rfb.service
+```
+
+Paste the configuration below and save. We're telling the Avahi daemon about RFB (remote framebuffer, basically VNC) and which port it runs on.
+
+```xml
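+<?xml version="1.0" standalone='no'?>
+<!DOCTYPE service-group SYSTEM "avahi-service.dtd">
+<service-group>
+  <name replace-wildcards="yes">%h</name>
+  <service>
+    <type>_rfb._tcp</type>
+    <port>5901</port>
+  </service>
+</service-group>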
+```
+
+
+And then restart the daemon:
+
+```sh
+sudo /etc/init.d/avahi-daemon restart
+```
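The `5901` in the service file is not arbitrary: by VNC convention, display `:N` listens on TCP port 5900 + N, so display `:1` ends up on port 5901. If the screen sharing option never shows up, it can help to check from another machine whether that port is reachable at all. Here is a minimal Python 3 sketch of that check; the helper names are made up for illustration and the IP is just the example address from the SSH step.

```python
import socket

VNC_BASE_PORT = 5900  # VNC convention: display :N listens on TCP port 5900 + N


def vnc_port(display):
    """Return the TCP port for a VNC display number (e.g. :1 -> 5901)."""
    return VNC_BASE_PORT + display


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, ...
        return False


if __name__ == "__main__":
    pi_address = "10.0.1.46"  # example IP from the SSH step; use your own
    port = vnc_port(1)
    state = "reachable" if is_port_open(pi_address, port) else "not reachable"
    print("VNC display :1 on %s (port %d) is %s" % (pi_address, port, state))
```

If the port is not reachable, the VNC server is probably not running on that display (remember it has to be started again after a reboot).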
+
+You should now see a new **Share Screen...** button. Click on that, type in your Pi password and you can now easily VNC into your Pi natively.
+
+
+
+
+Raspberry Pi visible in the OS X Finder
+
+This approach uses netatalk/AFP for sharing your Pi on the network so it will only work for Macs. If you'd like to share files with Windows machines, you'd want to set up Samba sharing. Also, AFP is technically deprecated, so a future-proof solution would be to set up SMB2... but I've always had a heck of a time getting it to work flawlessly and AFP works great for now.
+
+
+
+## See what others are doing with it
+
+##### Let's build something together
+
+
+#### Turning it into a NAS
+
+Now that you have your Pi and its files completely accessible via the OS X Finder, wouldn't it be neat to add more storage to your Pi, share that volume and back up to it? While I personally don't use my Pi for this (I [set up a larger Synology 4-disk NAS system](https://paulstamatiou.com/storage-for-photographers-part-2/ "Storage for Photographers (Part 2) - How a 12TB Synology NAS changed my digital life") for my terabytes of photos), it [can be done](http://www.howtogeek.com/139433/how-to-turn-a-raspberry-pi-into-a-low-power-network-storage-device/ "How to Turn a Raspberry Pi into a Low-Power Network Storage Device") with [a Raspberry Pi](http://www.techradar.com/how-to/computing/how-to-make-a-mac-time-capsule-with-the-raspberry-pi-1319989 "How to make a Mac Time Capsule with the Raspberry Pi"). Just don't expect it to be fast.
+
+
+
+
+There are many small and energy efficient storage options from USB sticks to external laptop and desktop hard drives and SSDs. Keep in mind that **the Raspberry Pi 3 only has USB 2.0** so you won't get the entire speed benefit of an SSD. And for the smaller drives that don't require their own power source, you will still actually want to plug it into a powered USB hub before plugging into the Pi to make sure you don't cause stability problems by stealing too much juice from the Pi itself.
+
+The Raspberry Pi ended up getting so popular that Western Digital actually created a more efficient drive _just_ for the Pi. Called the [WD PiDrive](http://wdlabs.wd.com/products/wd-pidrive-314gb/) it's a 314GB hard drive (314GB as in π, get it?) with a native 7mm USB connection. Unfortunately, it's pricey for how many gigabytes you get.
+
+As you might expect with a real NAS, **you can connect the Raspberry Pi to a UPS battery backup**: either a [real desktop-class UPS](https://melgrubb.com/2014/09/05/raspberry-pi-home-server-part-15power-failures/ "Raspberry Pi Home Server: Part 15–Power Failures") or a [tiny add-on board like this](http://www.modmypi.com/raspberry-pi/breakout-boards/pi-modules/ups-pico "UPS PIco - Uninterruptible Power Supply & I2C Control HAT") or [this](https://www.pi-supply.com/product/pi-ups-uninterrupted-power-supply-raspberry-pi/ "Pi UPS – Uninterrupted Power Supply for Raspberry Pi") that sits on top of the Pi and has a cell-phone battery with enough battery to let your Pi run for a few hours and safely shut down.
+
+If you don't need the data portion of a real UPS system (being able to tell your Pi it's now running on battery and should shut off soon), you can just get a [good USB battery pack](https://www.amazon.com/Powerful-10000mAh-Anker-PowerCore-Technology/dp/B013HSQXZC/ref=as_li_ss_tl?srs=2528932011&ie=UTF8&qid=1464849921&sr=8-4&keywords=anker+powercore%2B&linkCode=ll1&tag=paulstamatiou-20&linkId=e2f04b6418a0bbb1e5b7890210d64b70 "Anker PowerCore+ 10050 Premium Aluminum Portable Battery Charger ") that you always keep plugged in. Make sure you get a reputable one with the appropriate circuitry to support pass-through charging or build your own with [this PowerBoost circuit](https://www.adafruit.com/products/2465 "PowerBoost 1000 Charger - Rechargeable 5V Lipo USB Boost @ 1A - 1000C") and a 3.7V LiPo battery.
+
+You can also set up your Pi to be a [Time Machine backup destination](https://pwntr.com/2012/03/03/easy-mac-os-x-lion-10-7-time-machine-backup-using-an-ubuntu-linux-server-11-10-12-04-lts-and-up/ "Easy Mac OS X (Mountain) Lion and Mavericks 10.7, 10.8 and 10.9 Time Machine backup using an Ubuntu Linux server [11.10, 12.04 LTS and up]") on the network and you can even [install CrashPlan](https://gist.github.com/n8henrie/37d96807e31d94ca0464 "Set up CrashPlan on Raspberry Pi (Raspbian Jessie)") to have all your Pi's files backed up to the cloud as well. But be warned it won't be particularly fast.
+
+###### Mounting and sharing a USB drive
+
+Regardless of what drive you get, you'll want to mount it and have netatalk share it so your Mac can access it. While Ubuntu MATE has some automounting stuff, I prefer to disable it and proceed the old-fashioned way. On Ubuntu MATE (not via SSH, you technically can with gsettings but it didn't work for me), type `dconf-editor` in the terminal to open the GUI dconf editor. Browse to `org.gnome.desktop.media-handling` in the left pane and uncheck `automount` and `automount-open`. Reboot.
+
+* Prepare your USB device by formatting it to ExFAT if it's not already. If you're not using Ubuntu MATE on your Pi, you will want to install this package to add support for ExFAT mounting: `sudo apt-get install exfat-fuse`
+* Plug in your USB device and type in `sudo blkid`:
+
+
+```sh
+stammy@rpi:~$ sudo blkid
+[sudo] password for stammy:
+/dev/mmcblk0: PTUUID="580a66ff" PTTYPE="dos"
+/dev/mmcblk0p1: SEC_TYPE="msdos" LABEL="PI_BOOT" UUID="4442-965D" TYPE="vfat" PARTUUID="580a66ff-01"
+/dev/mmcblk0p2: LABEL="PI_ROOT" UUID="e440adac-fcf9-4b68-9f94-6bfd030f60b3" TYPE="ext4" PARTUUID="580a66ff-02"
+/dev/sda1: UUID="9C33-6BBD" TYPE="exfat"
+```
+
+* We're looking for the UUID of the USB device so we can mount the drive based on its unique ID instead of its location, so it will always mount reliably. In this case my drive is the last line with type `exfat`, since it is a large 128GB SDXC card that I plugged in (not the micro-SD card but a USB card reader and SD card just to test this out). You can verify that this is the correct line item by ejecting the drive and running the `sudo blkid` command again to see that the line vanishes.
+* Create a new directory where we will mount the drive and then have your user account own it. If you are using the default pi username it will just be the following:
+
+
+```sh
+sudo mkdir /usb-drive
+sudo chown -R pi:pi /usb-drive
+```
+
+* Now for the real work: we need to add a line to our file systems table file. It is _very important_ that this is typed correctly with the correct UUID and filesystem type for your drive. If this is incorrect your Raspberry Pi will get stuck at boot and you won't even be able to SSH in; you'll have to enter emergency mode to fix the file.
+
+
+```sh
+sudo vim /etc/fstab
+```
+
+* Now add this line to the bottom of your fstab file, making sure to replace XXXX-XXXX with the UUID from the blkid command earlier and using the correct file system type (vfat, exfat, etc):
+
+
+```
+UUID=XXXX-XXXX /usb-drive exfat auto,nofail,uid=1000,gid=100,umask=0002,rw 0 0
+```
+
+* Now we'll add this new drive as an item for netatalk to share. You'll need to edit this file:
+
+
+```sh
+sudo vim /etc/netatalk/AppleVolumes.default
+```
+
+* Scroll to the very bottom and add this line representing the new persistent mount point for your USB storage device:
+
+
+```
+/usb-drive "USB Drive"
+```
+
+* Reboot. Your drive should now be shared and accessible on the network!
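Since a typo in fstab can leave the Pi stuck at boot, one option is to generate the line from the `blkid` output instead of typing the UUID by hand. The following is only a sketch of that idea: the function names are mine, the mount options are copied from the fstab line above, and you should still read the result before pasting it into `/etc/fstab`.

```python
import re

# Mount options copied from the fstab line in this guide.
MOUNT_OPTIONS = "auto,nofail,uid=1000,gid=100,umask=0002,rw"


def parse_blkid_line(line):
    """Split one line of `blkid` output into (device, {KEY: value})."""
    device, _, rest = line.partition(":")
    fields = dict(re.findall(r'(\w+)="([^"]*)"', rest))
    return device.strip(), fields


def fstab_entry(blkid_line, mount_point):
    """Build an fstab line that mounts the device by UUID at mount_point."""
    device, fields = parse_blkid_line(blkid_line)
    if "UUID" not in fields or "TYPE" not in fields:
        raise ValueError("no UUID/TYPE found for %s" % device)
    return "UUID=%s %s %s %s 0 0" % (
        fields["UUID"], mount_point, fields["TYPE"], MOUNT_OPTIONS)


if __name__ == "__main__":
    # The /dev/sda1 line from the blkid output shown earlier.
    line = '/dev/sda1: UUID="9C33-6BBD" TYPE="exfat"'
    print(fstab_entry(line, "/usb-drive"))
```

Lines that describe a whole disk rather than a partition (like `/dev/mmcblk0` above, which only has a `PTUUID`) are rejected with an error instead of producing a bogus entry.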
+
+#### Benchmarking, overclocking and cooling
+
+While I won't dwell on this too much, it's possible to overclock your Raspberry Pi to achieve higher CPU and RAM speeds. Why would you want to overclock your Pi? You might want to squeeze some extra performance from a CPU-limited process like video transcoding or the like. Or you might just want to see if you can overclock it for fun.
+
+There's a [simple Pi benchmark script](https://github.com/aikoncwd/rpi-benchmark) that includes overclocking documentation so you can run before and after your overclock to measure benefits. There are also quite a few guides on the topic if you're really curious: [Raspberry Pi 3 Overclocking](http://www.jackenhack.com/raspberry-pi-3-overclocking/ "Raspberry Pi 3 Overclocking") and [Pi 3 Overclocking, Stability Testing & Cooling](https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=144391&p=960651 "Pi3 Configuration, Overclocking, Stability Testing & Cooling"). However, **do not attempt overclocking** without adding additional cooling to the Raspberry Pi CPU and RAM.
+
+###### Heatsinks
+
+Actually, even if you don't overclock your Pi but you put it through its paces or place it in a closed case you will still benefit from adding some heatsinks to your Pi. They're not absolutely required, but I sleep better at night knowing my Pi is nice and cool. Also, I used to overclock and watercool all my computers over a decade ago so I can't help but enhance my Pi's cooling situation.
+
+While you could go a bit extreme with a [massive heatsink and fan](https://www.youtube.com/watch?v=WfQMLInuwws "Raspberry Pi 3: More Extreme Cooling"), peltier setup or even a custom watercooling contraption, it's all bound to be overkill unless you are doing some crazy hardware voltage modifications to your Pi. I'm assuming that's not you and just a simple heatsink will do.
+
+There are 2 main chips to consider for cooling: The primary SoC that's on the top of the board and the RAM chip underneath the board. There is also a smaller chip that gets a bit warm near the USB ports and that's the USB and Ethernet controller. I did a bit of research and eventually ended up getting some high-quality (albeit pricey) copper [RAM heatsinks](https://www.amazon.com/gp/product/B002BWXW6E/ref=as_li_ss_tl?ie=UTF8&psc=1&linkCode=ll1&tag=paulstamatiou-20&linkId=2a12faeae15ac183f4fafcda6e879b2d "ENZOTECH Memory Ramsink BMR-C1") and [MOSFET heatsinks](https://www.amazon.com/gp/product/B004CLDIHK/ref=as_li_ss_tl?ie=UTF8&psc=1&linkCode=ll1&tag=paulstamatiou-20&linkId=9eec0bca23273108f9bfc933cc94d134 "Enzotech MOS-C1 MOSFET Heatsinks - 10 Pack") from ENZOTECH. The RAM heatsinks fit well over the SoC and the RAM, but you can also do the same by just placing 4 of the tiny MOSFET heatsinks on each chip.
+
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09376-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09390-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09392-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09397-1500.jpg)
+
+
+
+
+
+Unfortunately, they are a bit tall so having this on the underside of your Pi can limit your case mounting options. I also got a set of [low profile aluminum heatsinks](https://www.amazon.com/gp/product/B00A88DVTG/ref=as_li_ss_tl?ie=UTF8&psc=1&linkCode=ll1&tag=paulstamatiou-20&linkId=28ccdcc3c6502e804397c1590f214fc5 "LinuxFreak brand Aluminum Heatsink set for Raspberry Pi - Set of 2 Heat Sinks") that would work as well.
+
+With my Pi now heatsink'd up I can let it handle just about any task and not worry about it overheating. You can also get a 5V fan to blow over the heatsinks and [use a script to only spin it up when it gets hot](https://medium.com/@edoardo849/how-to-control-a-fan-to-cool-the-cpu-of-your-raspberrypi-3313b6e7f92c "How to control a fan to cool the CPU of your RaspBerryPi").
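To give a rough idea of what such a fan script does, here is a small sketch of the decision logic. It is not the script from the linked article. It assumes the SoC temperature can be read from `/sys/class/thermal/thermal_zone0/temp` (reported in millidegrees Celsius), and the 65°C / 55°C thresholds are values I picked for illustration. Using two thresholds keeps the fan from rapidly toggling when the temperature hovers around a single cutoff.

```python
def read_cpu_temp_c(path="/sys/class/thermal/thermal_zone0/temp"):
    """Read the SoC temperature in degrees Celsius.

    The kernel reports the value in millidegrees in this sysfs file.
    """
    with open(path) as f:
        return int(f.read().strip()) / 1000.0


def fan_should_run(temp_c, fan_is_on, on_above=65.0, off_below=55.0):
    """Decide the next fan state using two thresholds (hysteresis).

    Above `on_above` the fan runs, below `off_below` it stops, and in
    between it simply keeps doing whatever it was doing.
    """
    if temp_c >= on_above:
        return True
    if temp_c <= off_below:
        return False
    return fan_is_on
```

In a real script you would call these in a loop every few seconds and feed the result into a GPIO output. A fan draws far more current than a GPIO pin can supply, so the pin should only switch it through a transistor, as covered in the GPIO section further down.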
+
+
+
+## Using the I/O pins
+
+##### Pi electronics 101
+
+
+
+Time for the fun part: tinkering with some electronics and the Pi's General Purpose I/O (GPIO) pins!
+
+The Pi 3 (and the Pi Zero, just without the connector pins) has 40 pins, of which 26 are GPIO pins you can program. The other pins are ground (8), 5V (2), 3.3V (2) and reserved EEPROM pins (2).
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09453-1500.jpg)
+
+
+
+
+There's a lot you can do with these pins. You can program any GPIO pin on or off (high or low) which provides 3.3V at a max current of around 50mA[5](#footnote-5). You can also have your program accept input on these pins, such as when connecting a sensor, switch or other device.
+
+However, 50mA off a 3.3V pin is really not a lot to power anything. If you need a bit more juice, there are the two 5V pins, which come straight from the same power source powering the Pi, minus however much current the Pi consumes. So with a 2.5A power supply, subtract about 1A for general Pi use with some small devices plugged in and you can probably consume a max of 1.5A off those pins.
+
+Instead, the GPIO pin output should just be used as a signal to switch something on but not power it. If you're going to need to power anything more than an LED, you're going to want to use an external power source and something like a transistor, [FET](http://www.robertcudmore.org/blog/?p=181 "Switching 12vdc on and off with a Raspberry Pi") or relay. There are tons of [electronics and Raspberry Pi guides](http://elinux.org/RPi_GPIO_Interface_Circuits "Raspberry Pi GPIO interface circuits") that cover all of this in detail, but now that you have a primer let's build something simple!
+
+#### Parts
+
+To make things easier for tinkering I purchased a [breadboard](https://www.adafruit.com/products/239 "solderless breadboard") so I could prototype little circuits without soldering. I also purchased a [T-Cobbler](https://www.adafruit.com/products/2028 "Pi T-Cobbler Plus - GPIO Breakout") that directly connects the Raspberry Pi pins to the breadboard. And of course various LEDs, wires, transistors, resistors and buttons to play with.
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09490-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09486-1500.jpg)
+
+
+
+
+#### Setup & powering an LED
+
+First, attach the T-Cobbler board to the breadboard and then to the Pi with the ribbon cable. Make sure the ribbon cable's white wire connects to the corner side of the Pi. Aside from making it easy to physically connect things with the breadboard, the T-Cobbler board also provides you with the GPIO pin labels.
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09497-1500.jpg)
+
+
+
+
+It's important to note how the breadboard works so you don't short anything out by accident. On the breadboard, the outer rails (marked by blue and red lines) run the length of the board. These are the power rails where you can connect GND to the **-** blue line with a jumper cable and 5V to the **+** red line (or even an external power source instead of from the Pi) and be able to access those lines throughout the length of the board.
+
+The lines between those outer rails (the terminal strips) run perpendicular. So if you want to plug into GPIO pin #17 for example, you would put a jumper cable anywhere to the left of that pin, but not the blue or red outer rails. You can read more about [breadboards in this article](https://learn.sparkfun.com/tutorials/how-to-use-a-breadboard "How to Use a Breadboard").
+
+Now let's just try to power an LED. Nothing special for now, we'll just connect the LED to the 5V source that comes directly from the power adapter connected to the Pi.
+
+But first we should be cautious about how much voltage and current we supply to the LED; we don't want to burn it out. In my case I got a [25mA 3.4V UV purple LED](https://www.adafruit.com/product/1793 "UV LED"). I used [this LED resistor calculator](http://ledcalc.com/) to find out exactly what I needed. It said 68Ω, but I could only find a larger 220Ω resistor at the time. No biggie, it just won't be as bright.
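The calculator is just applying Ohm's law to the voltage left over after the LED's forward drop: R = (V_supply - V_forward) / I. Here is a small sketch of the same arithmetic using the numbers above (5V supply, 3.4V forward voltage, 25mA). Rounding up to the E12 series of standard resistor values is my assumption about how the calculator lands on its suggestion.

```python
# E12 series of standard resistor values (one decade).
E12 = [10, 12, 15, 18, 22, 27, 33, 39, 47, 56, 68, 82]


def led_resistor_ohms(supply_v, forward_v, current_a):
    """Minimum series resistance from Ohm's law: R = (Vs - Vf) / I."""
    return (supply_v - forward_v) / current_a


def next_standard_value(ohms):
    """Smallest E12 value that is >= ohms, so the current stays in spec."""
    decade = 1
    while True:
        for base in E12:
            if base * decade >= ohms:
                return base * decade
        decade *= 10


def led_current_ma(supply_v, forward_v, resistor_ohms):
    """Current through the LED in mA for a given series resistor."""
    return (supply_v - forward_v) / resistor_ohms * 1000.0


if __name__ == "__main__":
    exact = led_resistor_ohms(5.0, 3.4, 0.025)
    print("exact: %.0f ohm, next standard value: %d ohm"
          % (exact, next_standard_value(exact)))
    print("current with 220 ohm: %.1f mA" % led_current_ma(5.0, 3.4, 220))
```

The exact value works out to 64Ω, which rounds up to the 68Ω the calculator suggested. With the 220Ω resistor the LED only sees about 7mA instead of its rated 25mA, which is why it is dimmer but perfectly safe.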
+
+Resistors don't have polarity so it doesn't matter which way they are used. LEDs do however. There's the cathode (**-**) that is easy to identify as it's the shorter wire from the LED and there is a flat side of the LED identifying the cathode. The longer wire coming from the LED is the anode (**+**). This can be done various ways with the breadboard, but I started by placing a jumper from GND to the blue power rail and then plugging the LED into the power rail, making sure that the short end of the LED (cathode) stayed on the negative blue rail. Then I connected the resistor from the 5V terminal strip on the breadboard that comes from the Pi to the positive red power rail. Now our LED will light up!
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09516-1500.jpg)
+
+
+
+
+#### Control the LED
+
+Okay now that we've got the basics out of the way, let's actually control the LED instead of having it on continuously. To do this we need to power it from a GPIO pin instead of the 5V line. Since the GPIO pin is 3.3V and our LED has about the same forward voltage, we can ditch the resistor now. As a word of caution, we can get away with powering the LED from the GPIO pin since it won't draw much current. Don't try to power anything other than an LED off a GPIO pin or it might damage your Pi.
+
+This time we just need to connect the LED anode (the long wire) to GPIO pin 17 and the cathode to GND. You can see how I used jumpers to get this done in the photo below. As for why I used GPIO pin 17: no reason, you can pick any GPIO pin and program it below.
+
+Then we SSH into our Pi (or work directly on it if you've got a display hooked up or are using VNC) and open up the Python interpreter so we can use the Python GPIO library. First we need to import that library. It should be included in your OS automatically if you're using Ubuntu MATE or Raspbian Jessie.
+
+```
+stammy@rpi:~$ python
+Python 2.7.11+ (default, Apr 17 2016, 14:00:29)
+[GCC 5.3.1 20160413] on linux2
+Type "help", "copyright", "credits" or "license" for more information.
+>>> import RPi.GPIO as GPIO
+>>> GPIO.setmode(GPIO.BCM)
+>>> GPIO.setup(17, GPIO.OUT)
+>>> GPIO.output(17, True)
+>>> GPIO.output(17, False)
+```
+
+Then we need to set the pin mode. This determines what we mean when we provide a GPIO pin number: either by the physical pin position (`GPIO.BOARD`), or by the Broadcom pin number (`GPIO.BCM`). The latter is listed on the T-Cobbler for us so I provided the `GPIO.BCM` mode. Then we setup() each GPIO pin to be used and tell the Pi if it will be used for input or output (`GPIO.IN` or `GPIO.OUT`). We'll go with output since we just want to power the LED instead of listen for an input signal.
+
+And now we can finally **turn the LED on and off** with these commands: `GPIO.output(17, True)` and `GPIO.output(17, False)`. Go ahead and try it a few times, and think about how we're talking to a tiny computer over a network and having it control our electronics for us. Pretty neat. Despite everything being an app or connected device these days, it's still fun to be able to control something simple like this.
+
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09519-1500.jpg)
+
+
+
+
+This [python GPIO library](https://sourceforge.net/p/raspberry-gpio-python/wiki/BasicUsage/) provides the essential foundation for interfacing with the i/o pins on your Pi. There is also a popular C library called [WiringPi](http://wiringpi.com/pins/) and one called [gpiozero](https://github.com/RPi-Distro/python-gpiozero), which provides higher-level functionality for quickly setting up common output and input components like LEDs, buttons, sensors and more. Definitely worth checking out for more advanced items.
+
+And while I just wanted to show the basics using the GPIO library in the Python interpreter, you can of course write the same commands in a `.py` file and run it as you please. There's a whole lot more you can do when actually writing your Python programs.
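As a rough idea of what that could look like, here is a minimal sketch that wraps the interpreter commands from above in a small script to blink the LED on GPIO pin 17. The file layout, the `blink` function and its parameters are my own choices for illustration. Only `setmode`, `setup`, `output` and `cleanup` come from the RPi.GPIO library. The GPIO module is passed in as an argument so the logic can be tried out on a machine that isn't a Pi.

```python
import time


def blink(gpio, pin, times=5, interval=0.5):
    """Blink an LED on a BCM-numbered pin, then release the GPIO pins.

    `gpio` is the RPi.GPIO module (or any object with the same methods),
    passed in so the logic can be exercised without the hardware.
    """
    gpio.setmode(gpio.BCM)
    gpio.setup(pin, gpio.OUT)
    try:
        for _ in range(times):
            gpio.output(pin, True)
            time.sleep(interval)
            gpio.output(pin, False)
            time.sleep(interval)
    finally:
        gpio.cleanup()  # reset the pins even if interrupted with Ctrl+C


if __name__ == "__main__":
    try:
        import RPi.GPIO as GPIO
    except (ImportError, RuntimeError):
        print("RPi.GPIO is not available; run this on the Pi itself.")
    else:
        blink(GPIO, 17)
```

Calling `cleanup()` at the end returns the pins to their default state, so the LED doesn't stay lit if the script is interrupted halfway through.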
+
+#### Using a transistor
+
+Remember how I mentioned that we don't really want to drive more than a simple LED directly off the GPIO pins? Since the GPIO pins can't provide much voltage or current [6](#footnote-6) you should find other means to trigger and power your connected components.
+
+One way to control your components is with a transistor. I'm using a basic BJT (bipolar junction transistor) NPN (negative-positive-negative) transistor, the PN2222. It can handle up to a peak of 40 Volts(!) at 1A, more than enough to drive something like a small motor, various lights, and so on. **They only need a tiny amount of current to flip on, thus saving your GPIO pins from doing the heavy lifting.**
+
+Typically you should also use a resistor between the transistor and the GPIO pin to reduce the current it draws. There is a bit more to it (like how the amount of current applied to the base can vary the collector current as the transistor acts as a simple amplifier, up to a "saturation" point), but that's best left for [some extra reading if you're curious](https://learn.sparkfun.com/tutorials/transistors/all "Transistors").
+
+
+
+
+Okay so we have 3 pins on our transistor: emitter, base, collector (EBC). It's important to note the exact pin layout as it varies by transistor. If you're looking at the flat side of the PN2222 NPN transistor, we have EBC from left to right. The middle base pin is what actually causes the transistor to trigger, making the normally open emitter and collector closed. This is the opposite behavior from a PNP transistor, with [some extra nuances](http://www.learningaboutelectronics.com/Articles/Difference-between-a-NPN-and-a-PNP-transistor "Difference Between an NPN and a PNP Transistor").
+
+
+
+While this circuit works for this very simple use, technically you would want to add a pull-down resistor from the base to ground to get it to switch off faster [and for other reasons](http://electronics.stackexchange.com/questions/56010/why-pull-base-of-bjt-switch).
+
+
+We want to connect the base pin to our GPIO pin 17 along with a resistor. I wanted to use a larger resistor but only had my same 220Ω resistors, so I put a few in series. This introduces enough resistance to lower the current the transistor draws from the GPIO pin while still leaving enough juice to saturate the transistor into a fully on state.
+
+Current flows from the collector to the emitter, so I connected a jumper cable from GND to the emitter and then placed the LED[7](#footnote-7) between the Pi's 5V pin and the collector. Then connect the LED anode to the 5V line and the cathode (the flat side of the LED) to the transistor's collector with that same 220Ω resistor in between.
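To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It rests on a few assumptions that are not from my measurements: a base-emitter drop of about 0.7V when the transistor is on, and the common rule of thumb that a BJT used as a switch is comfortably saturated when the base current is at least a tenth of the collector current. The resistor counts in the loop are hypothetical too.

```python
GPIO_HIGH_V = 3.3   # GPIO output level
V_BE = 0.7          # assumed base-emitter drop of a silicon BJT when on


def base_current_ma(resistor_ohms, gpio_v=GPIO_HIGH_V, v_be=V_BE):
    """Current drawn from the GPIO pin into the base, in mA."""
    return (gpio_v - v_be) / resistor_ohms * 1000.0


def is_saturated(base_ma, collector_ma, forced_beta=10.0):
    """Rule-of-thumb check: base current >= collector current / forced_beta."""
    return base_ma * forced_beta >= collector_ma


if __name__ == "__main__":
    # Hypothetical numbers of 220 ohm resistors in series on the base.
    for count in (1, 2, 3, 4):
        ib = base_current_ma(220 * count)
        print("%d x 220 ohm: %.1f mA from the pin, saturated for a 25 mA LED: %s"
              % (count, ib, is_saturated(ib, 25.0)))
```

With three resistors in series (660Ω) the pin supplies roughly 4mA, a small load for a GPIO pin and still more than the 2.5mA this rule of thumb asks for to switch a 25mA LED.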
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09530-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09525-1500.jpg)
+
+I forgot to add a resistor in front of the LED, so it was powered by 5V directly here. No issues, but you don't want to be running it like this for more than a bit.
+
+Now you can run the same `GPIO.output()` commands from earlier to switch the LED on and off, via the NPN transistor! This is a safer way to control small devices off your GPIO pins.
+
+But what should you do if you need something even heavier duty for controlling a larger motor, high-wattage LED or anything demanding more than an amp of continuous current? You can use a beefier transistor like a heavy duty BJT, Darlington or a MOSFET[8](#footnote-8). There are various differences between them that I'm not experienced enough to comment on. But I do know about the next option: relays!
+
+#### Use a relay
+
+Let's get a relay working for our last experiment. There are several kinds, from electromechanical to solid state and more, but the concept is the same: it will close or open any circuit you connect it to when triggered. The difference from our earlier BJT NPN transistor is that a relay is only on or off (no amplification characteristics) and the circuit you switch on is entirely separate from the switching logic (at least with electromechanical relays), so most relays can support much larger voltages while not interacting with your low voltage circuit. Okay I'm rambling a bit, but basically this is what you want if you want to run an even larger device, like a 120V lamp for example (assuming your relay is rated for that).
+
+I ended up getting an electromechanical relay that was [preassembled to just need a control signal](https://www.amazon.com/gp/product/B00VRUAHLE/ref=as_li_ss_tl?ie=UTF8&psc=1&linkCode=ll1&tag=paulstamatiou-20&linkId=097665a19e82d17d0bcccfd7229da356) to make it easy to get started; it wires up just like our previous transistor. It bakes in the necessary transistor and some safeguards like a flyback spike protection diode for the relay coil. Again, there's [more to read about on the EE side](http://electronics.stackexchange.com/questions/100134/why-is-there-a-diode-connected-in-parallel-to-a-relay-coil "Why is there a diode connected in parallel to a relay coil?") here if you're interested.
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09540-1500.jpg)
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-raspberry-pi3-DSC09544-1500.jpg)
+
+
+
+
+#### What's next?
+
+My intent with this section was to fall somewhere between informative overview and step-by-step. There's obviously a ton of detail I flew past and I only showed the most basic examples. But now you get to have fun trying out new circuits on your own! I love this stuff as you can probably tell.. I took a few EE classes in college long ago.
+
+While we only touched on using a single GPIO pin as output, there's lots to learn about using them for input as well. Here's some more reading about Raspberry Pi GPIO electronics to keep you busy:
+
+* Learn more about [circuits at Khan Academy](https://www.khanacademy.org/science/electrical-engineering/ee-circuit-analysis-topic)
+* [Connect a button](http://razzpisampler.oreilly.com/ch07.html "Connecting a Push Switch with Raspberry Pi") and use it as an input
+* [Control a servo motor](http://razzpisampler.oreilly.com/ch05.html#SEC7.12)
+* Hook up a [motion sensor as an input](https://www.raspberrypi.org/learning/parent-detector/worksheet/) and record a video when motion is detected.
+* Use an iOS app and Pi server like [Cayenne](http://www.cayenne-mydevices.com/ "Easy IoT for Raspberry Pi") or [MyPi](https://itunes.apple.com/us/app/mypi-control-your-raspberry/id1098156642?mt=8 "MyPi - Control your Raspberry Pi GPIO") to control your connected relays, sensors and other GPIO devices on your phone or the web. Or do it yourself with the self-hosted [Pi GPIO web interface](https://github.com/stuart-thackray/pi_gpio_web/ "Raspberry Pi GPIO Web Interface")
+
+
+
+## Building a digital photo frame
+
+##### with a 10" 1920x1200 display
+
+[](https://turbo.paulstamatiou.com/uploads/2016/06/pstam-rpi-photo-frame-DSC00194-1500.jpg)
+
+
+
+
+I wanted to do a real project with this Raspberry Pi. Something I would actually use around the house. I had narrowed it down to either a smart mirror or a digital photo frame.
+
+While there is definitely a strong cool factor with smart mirrors, I soon realized I just would not get real value out of it. There's not much information I care about enough to be on an ambient display that I can't interact with.
+
+Weather, news headlines, email subject lines? The majority of those require me to pull out my phone to actually learn more. And I wouldn't put anything too personal on there that guests might see (like work emails). There might be something there in the future when it's easy to set up eye tracking and have the device detect what my eye is dwelling on and open a browser to learn more about that news headline, or read it out loud to me. But for now I already check my phone a million times per day and I already get my news read to me with the Amazon Echo every morning.
+
+A digital photo frame on the other hand has an aesthetic value for me and given that I'm a [photographer](https://paulstamatiou.com/photos) as well, it's right up my alley. It will fit in nicely with the other framed photos and travel books throughout my house.
+
+#### We're going to need a display
+
+Now the obvious question: where was I going to put this and what kind of display would I need? There's also the question of whether I wanted a touchscreen display. I decided to keep it simple, no touchscreen. Though I did initially think it would be neat to swipe between photos. Going with a regular display also makes mounting the display easier as I could put it behind glass.
+
+As I began searching for displays I quickly realized that it was hard to find a high-resolution display. The majority seemed to be somewhere between [800x480](https://www.adafruit.com/products/2718) (like the official Raspberry Pi 7-inch touchscreen display) and 1280x800. I even found some folks using iPad 2 displays (9.7-inch 1024x768). Not bad but not terribly appealing to me. A low resolution screen would not do my photos justice, especially as I have gotten used to seeing them on high PPI displays.
+
+I also knew I would be mounting this in a frame and putting it on my bookshelf. While disassembling a regular desktop computer monitor is a popular route, I wanted something smaller and didn't want anything that required a bulky 120V cable and adapter. I needed a display that could be powered via a USB cable and accept an HDMI input.
+
+The maximum resolution the Pi can support is 1920x1200 so I wanted to get as close as possible in something around the 10" to 12" size. After lots of searching, I eventually stumbled across [this 10-inch beauty from Chalk Elec](http://www.chalk-elec.com/?page_id=1280#!/10-FullHD+-LCD-with-HDMI-interface/p/41737268/category=3094859 "10-inch FullHD+ LCD with HDMI interface"). Yes, it's expensive at $140 USD for just the panel, but it was just what I was looking for. It's 1920x1200 and at 10 inches this makes it a **ridiculous 226ppi panel**. That's a tad better than the 218ppi 5K iMac.
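If you want to compare other panels the same way, pixel density is just the diagonal measured in pixels divided by the diagonal measured in inches. A quick sketch of that calculation follows. The 5K iMac figures (5120x2880 at 27 inches) are its published specs rather than something from this post, and the function name is mine.

```python
import math


def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density: diagonal length in pixels / diagonal length in inches."""
    return math.hypot(width_px, height_px) / diagonal_in


if __name__ == "__main__":
    print("Chalk Elec 10-inch panel: %.0f ppi" % pixels_per_inch(1920, 1200, 10))
    # 5120x2880 at 27 inches are the 5K iMac's published specs.
    print("27-inch 5K iMac: %.0f ppi" % pixels_per_inch(5120, 2880, 27))
```

The same function makes it easy to see why the common 800x480 7-inch panels look coarse by comparison.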
+
+I tested the display and everything was working great! I was just amazed at the resolution. It would have been overkill if I was going to use this for anything other than a photo frame. I would have had to use a lower resolution setting to make it comfortable to use for longer periods.
+
+The display's controller board can accept power from either a dedicated wall adapter or from USB. I wanted to go with USB so I followed their instructions and soldered a jumper (0Ω resistor) where it says R12 to be able to accept USB power. And while I was tinkering with it I also decided to put heatsinks on the two LVDS chips as they got very hot when in use.
+
+I was a bit nervous working with this board. The cable running to the display is extremely short and delicate, whereas the huge HDMI cable is rather inflexible and heavy. I was quick to use some electrical tape to prevent the flat cable from getting plucked out and torn.
+
+#### Mounting the display
+
+Now that I had my functioning display, I needed to figure out how to mount it. I decided I wanted to go with a real photo frame with matting. I planned on putting it on my bookshelf. The original thought of hanging it on my wall was intriguing but hiding the cables would be tricky and I didn't feel like drilling into my wall and installing an up-to-code flush electrical outlet behind the mounting location.
+
+I went to a local frame store and **picked up a 9" x 12" frame** with matting. I carefully measured the viewport of the display and cut the sides of the matting to add room. This worked, but the matting had a beveled cut so my cuts don't look natural. Eventually I will have the frame store custom cut the matting in the size I need, but this will do for now.
+
+I then placed the display behind the matting. Fortunately, the bezel of the display came with a sticky foam so I just peeled that back and placed it on the matting. After the display was in place, I ran a thin line of hot glue around the sides.
+
+I cut pieces of cardboard and put them alongside the display to reduce pressure on the display when I put the back of the frame on. I then cut a space for the cables and the controller board to peek through as well. I'm surprised how easily this part all came together. It actually looks pretty decent! Well, everything except for the hacky way I cut the frame backboard. In hindsight I could have probably just cut a hole for the display's cable only instead of having it be against the display. I was just a bit concerned with moving that cable as it's pretty short and very fragile.
+
+#### Mounting the Pi
+
+With the back of the frame in place, I still had a bit of room to tuck the Pi away out of sight. The HDMI cable was pretty thick so that pretty much held the Pi in place on its own. I hot glued the cables in place to be safe. Then I added some small 3M plastic hooks to keep the Pi in place: one hooks into the Ethernet port and the other I attached with a small wire to the corner mounting hole.
+
+Now it was time to give it a test boot. I was powering the Pi and display off separate USB sources so that I had more than enough juice to spare for the Pi. Instead of having to find two USB power adapters, I opted for [this great Anker dual USB adapter](https://www.amazon.com/gp/product/B012WMWPJW/ref=as_li_ss_tl?ie=UTF8&psc=1&linkCode=ll1&tag=paulstamatiou-20&linkId=cdcfeeb2de9fca9e548e87beea54255e "Anker 24W Dual USB Wall Charger PowerPort 2") which provides 2.4 amps per USB port.
+
+With the Pi Frame now up and running **I had two challenges**:
+
+* Figure out how to display my photos in fullscreen
+* Figure out how to turn the display and backlight off when needed.
+
+#### Displaying the photos
+
+The traditional route for a digital photo frame is to have a synced photos folder with something like Dropbox or rsync and then use a fullscreen image viewer like `feh` or `fbi`. Both are rather no-frills setups.
+
+I wanted to see if I could bypass the photo syncing portion of this and **just use the Google Photos website**. I've already [professed my love for Google Photos](https://paulstamatiou.com/storage-for-photographers-part-2/) so it would be great to use it here as well.
+
+I would only need to have a browser that could display in fullscreen or kiosk mode. I decided to give Firefox a try, only because there is currently an issue with the latest Chromium crashing on Ubuntu MATE. There are also dedicated kiosk browsers like [kweb](https://www.raspberrypi.org/forums/viewtopic.php?t=40860) and [FullPageOS](https://github.com/guysoft/FullPageOS).
+
+Fortunately, **Firefox's native fullscreen mode did the trick**. I just had to log into Google Photos, select an album and hide the mouse in the corner. To get photos to be the perfect aspect ratio to fill the 1920x1200 display and not have pillarboxing or letterboxing, I created a new album and **uploaded some of my travel photos cropped to a 16:10 aspect ratio**. It worked perfectly!
+
+I was able to navigate between photos in the selected album using the left and right arrow keys. Google Photos also recently created a slideshow mode that I tested out. However, there were two issues with the slideshow mode to address. The first is that it puts a control box in the bottom left corner that does not hide. I ended up using the Stylish Firefox plugin to inject CSS to hide that box.
+
+The other issue with slideshow mode is that it's a bit too fast for this use case. It seems to progress to the next photo every ~5 seconds. I'd much prefer something every few minutes or hours. I ended up not using slideshow mode. I can just hit the right arrow on the keyboard to get the next photo or video.
+
+I decided to take a stab at automating this by writing and **injecting some JavaScript into the page to change the photos for me every 10 minutes**. I used the Firefox Greasemonkey add-on and wrote the script below. Unfortunately, when you get to the last photo in an album in Google Photos, the next arrow disappears and it does not loop you back to the beginning. So I had to have the script detect when you got to the end and then go backwards until it hits the first photo and so on.
+
+You can adjust the time by changing the `600000` number (10 minutes in milliseconds) on the last line.
+
+```javascript
+// ==UserScript==
+// @name google photos slower slideshow
+// @namespace piframe
+// @include https://photos.google.com/album/*
+// @version 1
+// @grant none
+// ==/UserScript==
+
+function next_or_prev() {
+  // remember the travel direction between calls
+  window.direction = window.direction || 'forward';
+  // next/previous arrows, looked up by their class names
+  var next_el = document.getElementsByClassName("oJhm5 gMFQN")[0];
+  var prev_el = document.getElementsByClassName("oJhm5 KUdGif")[0];
+
+  if (direction == 'forward') {
+    var css_display = window.getComputedStyle(next_el).getPropertyValue('display');
+    if (css_display == 'block') {
+      next_el.click();
+    } else if (css_display == 'none') {
+      // next arrow is hidden, so this is the last photo: turn around
+      window.direction = 'backward';
+      next_or_prev();
+    }
+  } else if (direction == 'backward') {
+    var css_display = window.getComputedStyle(prev_el).getPropertyValue('display');
+    if (css_display == 'block') {
+      prev_el.click();
+    } else if (css_display == 'none') {
+      // previous arrow is hidden, so this is the first photo: go forward again
+      window.direction = 'forward';
+      next_or_prev();
+    }
+  }
+}
+window.setInterval(function(){next_or_prev()}, 600000);
+```
+
+**Update:** The issue with this approach is that it relies on the class names of the left and right arrows, which are bound to change with future Google Photos web deploys. **I rewrote this script (below)** to trigger a right arrow key event instead. It keeps trying to go to the next photo and if the URL doesn't change it figures it must be at the last photo so it goes to the first photo. This **requires you to provide the URL of the first photo** in the album you are using.
+
+```javascript
+// ==UserScript==
+// @name google photos slower slideshow
+// @namespace piframe
+// @include https://photos.google.com/album/*
+// @version 1
+// @grant none
+// ==/UserScript==
+
+// CHANGE first_photo TO USE THE URL OF THE FIRST PHOTO IN YOUR ALBUM
+var first_photo = 'https://photos.google.com/album/XXXX/photo/XXXX';
+function pressKey() {
+  var key = 39; // right arrow keycode
+  var body = document.getElementsByTagName('body')[0];
+  if(document.createEventObject) {
+    var eventObj = document.createEventObject();
+    eventObj.keyCode = key;
+    body.fireEvent("onkeydown", eventObj);
+  } else if (document.createEvent) {
+    var eventObj = document.createEvent("Events");
+    eventObj.initEvent("keydown", true, true);
+    eventObj.which = key;
+    body.dispatchEvent(eventObj);
+  }
+}
+function next_or_prev() {
+  var current_url = window.location.href;
+  pressKey();
+  if (current_url == window.location.href) {
+    // page didnt change, must be at last photo
+    // load the first photo
+    window.location.href = first_photo;
+  }
+}
+window.setInterval(function(){next_or_prev()}, 600000);
+```
+
+Now we're in business! With the photo display stuff figured out, I put the Pi Frame in its new home on my bookshelf. I nestled it in between my travel book collection, hooked up an extension cord to the Anker power adapter and hid the cable.
+
+#### Turning the display off
+
+With that out of the way, the next objective was to figure out how to turn off the display. There were two routes to explore. I could cut up the USB power cable going to the display and place a relay on it so the Pi's GPIO pin could programmatically turn the power on and off. The other route would be to figure out how to programmatically tell the display controller that there was no signal so it would turn off the backlight automatically. I explored the latter route.
+
+I was unable to get the display and backlight to shut off completely by simply changing the OS setting for display inactivity. It would black out the screen but the display controller would still think there was a video signal and keep the backlight on. What I needed to do was turn off the HDMI port entirely.
+
+I wrote two bash scripts. The first, `display_off.sh`, simply ran this command: `tvservice -o`. I saved that file and made sure to `chmod +x` it to make it executable. I tested it out and it correctly turned off the display and backlight!
+
+The next script to turn the display back on was a bit trickier. This is what I put in my `display_on.sh` script:
+
+```
+tvservice -p
+chvt 9 && chvt 7
+xrefresh -d :0
+```
+
+I tried a lot of stuff before I landed on something that worked. The chvt commands require sudo, so this script must always be run with sudo. I added the bash script to the sudoers file so that it at least doesn't ask for a password. I ran `sudo visudo` and added this line to the end (replace stammy with your Pi's username):
+
+```
+stammy ALL=NOPASSWD: /home/stammy/display_on.sh
+```
+
+#### Wiring up a button and fan
+
+With these two scripts I could SSH into the Pi and turn the display on and off. But SSHing into the Pi each time I wanted to toggle the display was going to be annoying. I decided to wire up a physical button to run this script for me. I soldered up a push button, hot glued it to the top right corner of the frame for easy access and attached it to GPIO pin 17 and GND.
+
+But... if I was going to be doing a bit of soldering I figured I would also **wire up a relay to control a fan to cool down the display controller board** when running. It's probably not entirely necessary, but even with the copper heatsinks the display controller still gets very hot.
+
+I rebuilt the relay circuit I talked about earlier in this article to control a small 5V 50mm fan powered off the Pi's 5V power source. Except this time I soldered everything (rather hackily at that) instead of using a breadboard. I wanted to use a transistor for this instead of a relay but I didn't have a 1N4001 flyback diode on hand to prevent inductive kickback when the fan shuts off.
+
+With everything back in place, I put the Pi Frame back on the bookshelf. I just had to **write a python script to detect the button press** then appropriately trigger the display on or off bash script and trigger the fan relay.
+
+With the button hooked up to GPIO pin 17, I set up that pin to listen for input. One thing to note is that it is configured with the Pi's internal pull-up resistor, so I didn't have to use an external resistor along with the button when wiring it up. The pull-up keeps the pin's input voltage from floating (as it would with stray capacitance) while the button is open, which makes it easier for the Pi to tell whether the button is being pushed.
+
+Since I just have the push button connected to GND and a GPIO pin, the pull-up resistor makes the GPIO pin read high normally and read low when the button is pushed. This is a bit opposite of what you may expect, which is why the script below only acts when the input is detected as false. The other way to do this is to wire the push button to a 3.3V line and a GPIO pin with a pull-down resistor. That will get the GPIO input to be pulled down to low by default and go high when the button is pressed and the circuit closes.
+
+```python
+import subprocess
+import time
+
+import RPi.GPIO as GPIO
+
+BUTTON_PIN = 17  # push button wired between this pin and GND
+FAN_PIN = 27     # signal line of the fan relay
+
+GPIO.setmode(GPIO.BCM)  # use Broadcom (BCM) pin numbering
+GPIO.setup(BUTTON_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)
+GPIO.setup(FAN_PIN, GPIO.OUT, initial=GPIO.LOW)
+display_on = True
+
+while True:
+    # with the pull-up, the pin reads low (False) while the button is pressed
+    if not GPIO.input(BUTTON_PIN):
+        print('Button press registered')
+        if display_on:
+            display_on = False
+            GPIO.output(FAN_PIN, False)
+            subprocess.call("/home/stammy/display_off.sh", shell=True)
+            time.sleep(0.5)
+        else:
+            display_on = True
+            GPIO.output(FAN_PIN, True)
+            subprocess.call("/home/stammy/display_on.sh", shell=True)
+            time.sleep(0.5)
+    time.sleep(0.01)  # short pause so the polling loop does not peg the CPU
+```
+
+I connected the relay's signal line to GPIO pin 27 so I just needed to set that as output. Now I just listen for the button input and toggle between turning the display on and off. To actually run the aforementioned bash scripts I use the `subprocess.call()` lines. The `time.sleep()` lines are added in there so that holding the button a bit too long won't run the scripts multiple times.
+
+I saved the script as a python file and ran it in a terminal on the Pi. Again, I need sudo here for the chvt commands mentioned above:
+
+```
+sudo python display_button.python
+```
+
+To make it easier whenever you reboot the Pi, you can add this as a custom application launcher in the top panel. Set it to Type: "Application in Terminal."
+
+
+
+
+
+
+
+Time to actually enjoy the Pi Frame!
+
+#### What's next with the Pi Frame?
+
+I'm pretty impressed with the result of this little frame. Impressive, crisp image quality with a ridiculously easy Google Photos "integration"... love it. However, there are a few more things I'd like to explore with this project in the future:
+
+* Get another matting sheet professionally cut to my exact spec since I was unable to replicate the same beveled cut it came with.
+* Use a motion sensor to turn off the display after no movement has been detected in the room for a while.
+* Replace Firefox with Chromium when the issues are addressed and see if I can get hardware accelerated videos working so that my videos on Google Photos can play well (they play now but frames drop). Detect proximity and if someone is actively looking at the display while a video is playing, enable audio to play through a small speaker.
+* Set up an Amazon Echo Alexa skill so that I can tell the Pi to change to various Google Photos albums on command, or turn on/off the display.
+* Connect a Leap Motion to be able to gesture between photos without needing to touch anything.
+* Be able to use the display for various other tasks by hosting simple local webpages and changing the browser tab on command. For example one page could be a digital clock or analog clock with an interesting watch face.
+
+Thanks for reading! If you've enjoyed this post I only ask that you please share it.
+
+
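+[1](#r1) It can only run older kernels, and software support is worse.
+
+[2](#r2) For example, turning off unused ports and the onboard LEDs, and cutting down on peripherals and excess software that eat CPU cycles. [Here is an article](http://www.jeffgeerling.com/blogs/jeff-geerling/raspberry-pi-zero-conserve-energy "Raspberry Pi Zero - Conserve power and reduce draw to 80mA") about doing this on the Pi Zero to cut idle draw down to just 80mA.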
+
+[3](#r3) There is actually [a way to go through the initial setup via USB](http://blog.gbaman.info/?p=791 "Raspberry Pi Zero - Programming over USB") without a display at all, but it's a bit more complicated than I'd like to explain in this post. You can also preconfigure and build an image that has SSH ready to go so you can do the entire setup via SSH on boot, or if you use Raspbian it has SSH enabled by default. You'll need to connect via Ethernet instead of Wi-Fi at first and [then follow these steps](http://raspberrypi.stackexchange.com/a/27352 "How to set up Raspberry Pi without a monitor?").
+
+[4](#r4) This actually installs an older version of netatalk, which still works but if you must always have the latest version of everything, search for how to set up netatalk 3.0+. You'll have to build it yourself, but it lets Spotlight index and search your Pi as well.
+
+[5](#r5) Though I think that is just 50mA across _all_ GPIO pins combined. Not very much.
+
+[6](#r6) Especially if you will be using more than one GPIO pin since they seem to all share the 50mA maximum current available on the 3.3V line; though there is a chance this number is a bit higher for the Pi 3, it was very hard to actually find this listed anywhere official.
+
+What happens if you draw too much current off the GPIO pins when in output mode? Most likely it will just cause the Raspberry Pi to reboot. In more extreme scenarios of drawing lots of power (including various USB devices) you could blow the Raspberry Pi's built-in polyfuse which could take anywhere from a few minutes to hours to get back to normal.
+
+[7](#r7) Yeah, I'm still using the LED here to test it out but you can place a larger device here. Up to about ~500mA safely if you like, you just may want to power it from something other than the Pi's 5V line.
+
+[8](#r8) You'll have to do some searching for one that works at 3.3V, as most require a bit more. Or just use a transistor to switch on that MOSFET with a larger power source.
\ No newline at end of file
diff --git a/raw/Page dewarping.md b/raw/Page dewarping.md
new file mode 100644
index 0000000..4874b4f
--- /dev/null
+++ b/raw/Page dewarping.md
@@ -0,0 +1,255 @@
+Original: [Page dewarping](https://mzucker.github.io/2016/08/15/page-dewarping.html)
+
+---
+
+
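+Flattening images of curled pages, as an optimization problem.
+
+# Overview
+
+A while back, I wrote a script to create PDFs from photos of hand-written text. It was nothing special (just [adaptive thresholding](http://docs.opencv.org/3.0-last-rst/modules/imgproc/doc/miscellaneous_transformations.html#cv2.adaptiveThreshold) and combining multiple images into a PDF), but it came in handy whenever a student sent me their homework as a pile of JPEGs. After I demoed the program to my fiancée, she ended up asking me to run it from time to time on archival documents for her linguistics research. This summer, she came back from the library with a large number of images in which the text was visibly warped because of curled pages.
+
+So I decided to write a program that _automatically_ turns pictures like the one on the left below into the one on the right: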
+
+
+
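+As with every project on this blog, the code is [up on github](https://github.com/mzucker/page_dewarp). If you want to see more before-and-after shots first, feel free to skip down to the results section.
+
+# Background
+
+I am definitely not the first person to come up with a method for document image dewarping (it is even implemented in Dan Bloomberg's open-source image processing library [Leptonica](http://www.leptonica.com/dewarping.html)), but when it comes to understanding a problem, there is nothing quite like implementing it yourself. Aside from browsing through the Leptonica code, I also skimmed a few papers on the topic, including a [summary](http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.99.7439) of the results of a dewarping contest, as well as an [article](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.552.8971) about the contest-winning Coordinate Transform Model (CTM) method.
+
+Both the Leptonica dewarping method and the CTM method share a similar hierarchical problem decomposition:
+
+ 1. Split the text into lines.
+
+ 2. Find a warp or coordinate transformation that makes the lines parallel and horizontal.
+
+To me, Leptonica's approach to the second subproblem seems a bit ad hoc compared to CTM's 3D “cylinder” model. To be honest, I had some trouble deciphering the CTM paper, but I liked the idea of a model-based approach. So I decided to create my own parametric model, in which the appearance of a page is determined by a number of parameters: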
+
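+ * a rotation vector $r$ and a translation vector $t$, both in ${\Bbb {R}}^3$, that parameterize the 3D orientation and position of the page
+
+ * two slopes $\alpha$ and $\beta$ that specify the curvature of the page surface (see the spline plots below)
+
+ * the vertical offsets $y_1,...,y_n$ of $n$ horizontal spans on the page
+
+ * for each span $i\in\{1,...,n\}$, the horizontal offsets $x_i^{(1)},...,x_i^{(m_i)}$ of $m_i$ points in the horizontal span (all at vertical offset $y_i$)
+
+The page's 3D shape comes from sweeping a curve along the local $y$-axis (the top-to-bottom direction). Each $x$ (left-to-right) coordinate on the page maps to a displacement $z$ of the page surface. I model the horizontal cross-section of the page surface as a cubic spline whose endpoints are fixed at zero. The shape of the spline is specified completely by its slopes $\alpha$ and $\beta$ at the endpoints.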
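+
+To make the spline concrete, here is one way to write it out. This is a sketch under the assumption that the page width is normalized so that $x$ runs from 0 to 1; the closed form is derived from the endpoint and slope constraints, not quoted from the code:
+
+$$
+z(x) = (\alpha + \beta)\,x^3 - (2\alpha + \beta)\,x^2 + \alpha\,x, \qquad z(0) = z(1) = 0, \quad z'(0) = \alpha, \quad z'(1) = \beta
+$$
+
+Differentiating gives $z'(x) = 3(\alpha + \beta)\,x^2 - 2(2\alpha + \beta)\,x + \alpha$, so $z'(0) = \alpha$ and $z'(1) = 3\alpha + 3\beta - 4\alpha - 2\beta + \alpha = \beta$, as required.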
+
+
+
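+As the plot shows, changing the slope parameters yields a variety of “page-like” curves. Below, I've generated an animation that fixes the page dimensions and all of the $(x,y)$ coordinates while varying the pose/shape parameters $r$, $t$, $\alpha$, and $\beta$. You can begin to appreciate that the parameter space spans a useful variety of page appearances: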
+
+
+
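+Importantly, once the pose/shape parameters are fixed, each $(x,y)$ coordinate on the page is projected onto a determined location on the image plane. Given this rich model, we can now frame the entire dewarping puzzle as an optimization problem:
+
+ * identify a number of _keypoints_ along horizontal text spans in the original photograph
+
+ * starting from a naïve initial guess, find the parameters $r$, $t$, $\alpha$, $\beta$, $y_1$, $...$, $y_n$, $x_1^{(1)}$, $...$, $x_n^{(m_n)}$ which minimize the [reprojection error](https://en.wikipedia.org/wiki/Reprojection_error) of the keypoints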
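+
+Written out, the objective could look like the following. The symbols $p$ and $\pi$ are assumptions introduced here: $p_i^{(j)}$ stands for the detected image location of the $j$-th keypoint on span $i$, and $\pi$ for the page-to-image projection of a page point under the current pose/shape parameters. The sum-of-squares form is also an assumption, chosen because it is the usual least-squares reading of reprojection error:
+
+$$
+E = \sum_{i=1}^{n} \sum_{j=1}^{m_i} \left\| \pi\!\left(x_i^{(j)}, y_i;\ r, t, \alpha, \beta\right) - p_i^{(j)} \right\|^2
+$$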
+
+Here is an illustration of reprojection before and after optimization:
+
+
+
+The red points in both images are detected keypoints on text spans, and the
+blue ones are reprojections through the model. Note that the left image
+(initial guess) assumes no curvature at all, so all blue points are collinear;
+whereas the right image (final optimization output) has established the page
+pose/shape well enough to place almost all of the blue points on top of each
+corresponding red point.
+
+Once we have a good model, we can isolate the pose/shape parameters, and
+invert the resulting page-to-image mapping to dewarp the entire image. Of
+course, the devil is in the details.
+
+# Procedure
+
+Here is a rough description of the steps I took.
+
+ 1. **Obtain page boundaries.** It’s a good idea not to consider the entire image, as borders beyond the page can contain lots of garbage. Instead of [intelligently identifying page borders](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.81.1467), I opted for a simpler approach, just carving out the middle hunk of the image with fixed margins on the edges.
+
+ 2. **Detect text contours.** Next, I look for regions that look “text-like”. This is a multi-step process that involves an initial adaptive threshold:
+
+
+
+…[morphological dilation](https://en.wikipedia.org/wiki/Dilation_\(morphology\)) by a
+horizontal box to connect up horizontally adjacent mask pixels:
+
+
+
+…[erosion](https://en.wikipedia.org/wiki/Erosion_\(morphology\)) by a vertical box to eliminate single-pixel-high “blips”:
+
+
+
+and finally, [connected component analysis](https://en.wikipedia.org/wiki/Connected-component_labeling) with a filtering step to eliminate any blobs
+which are too tall (compared to their width) or too thick to be text. Each
+remaining text contour is then approximated by its best-fitting line segment
+using [PCA](https://en.wikipedia.org/wiki/Principal_component_analysis), as
+shown here:
+
+
+
+Since some of the images that my fiancée supplied were of tables full of
+vertical text, I also specialized my program to attempt to detect horizontal
+lines or rules if not enough horizontal text is found. Here’s an example image
+and detected contours:
+
+
+
+ 3. **Assemble text into spans.** Once the text contours have been identified, we need to combine all of the contours corresponding to a single horizontal span on the page. There is probably a linear-time method for accomplishing this, but I settled on a greedy quadratic method here (runtime doesn’t matter much here since nearly 100% of program time is spent in optimization anyways).
+
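+Here is a Python sketch illustrating the overall approach (the cost function is passed in as `get_edge_cost`, and contours are referred to by index):
+
+```python
+import math
+from itertools import permutations
+
+
+def link_contours(contours, get_edge_cost):
+    # get_edge_cost(a, b) should score b as the right-hand neighbor of a,
+    # returning math.inf for pairs that must not be joined
+    edges = []
+    for a, b in permutations(range(len(contours)), 2):
+        cost = get_edge_cost(contours[a], contours[b])
+        if cost < math.inf:
+            edges.append((cost, a, b))
+
+    # consider the cheapest candidate edges first
+    edges.sort()
+
+    # reading "unconnected" as: a has no successor yet and b has no
+    # predecessor yet, so each contour gets at most one neighbor per side
+    succ, pred = {}, {}
+    for cost, a, b in edges:
+        if a not in succ and b not in pred:
+            succ[a], pred[b] = b, a
+    return succ, pred
+```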
+
+Basically, we generate candidate edges for every pair of text contours, and
+score them. The resulting cost is infinite if the two contours overlap
+significantly along their lengths, if they are too far apart, or if they
+diverge too much in angle. Otherwise, the score is a linear combination of
+distance and change in angle.
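+
+In symbols, the scoring rule just described might be sketched as follows. The names $d$ (distance between the two contours), $\Delta\theta$ (change in angle) and the weights $w_d$ and $w_\theta$ are placeholders introduced here; the text does not give the actual weights or thresholds:
+
+$$
+\text{cost}(a, b) =
+\begin{cases}
+\infty & \text{if } a \text{ and } b \text{ overlap significantly, are too far apart, or diverge too much in angle} \\
+w_d\, d(a, b) + w_\theta\, \Delta\theta(a, b) & \text{otherwise}
+\end{cases}
+$$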
+
+Once the connections are made, the contours can be easily grouped into spans;
+I also filter these to eliminate any that are too small to be useful in
+determining the page model.
+
+
+
+Above, you can see the span grouping has done a good job amalgamating the text
+contours because each line of text has its own color.
+
+ 4. **Sample spans.** Because the parametric model needs discrete keypoints, we need to generate a small number of representative points on each span. I do this by choosing one keypoint per 20 or so pixels of text contour:
+
+
+
+ 5. **Create naïve parameter estimate.** I use PCA to estimate the mean orientation of all spans; the resulting principal components are used to analytically establish the initial guess of the $x$ and $y$ coordinates, along with the pose of a flat, curvature-free page using [`cv2.solvePnP`](http://docs.opencv.org/3.0-last-rst/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#cv2.solvePnP). The reprojection of the keypoints will be accomplished by sampling the cubic spline to obtain the $z$-offsets of the object points and calling [`cv2.projectPoints`](http://docs.opencv.org/3.0-last-rst/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#cv2.projectPoints) to project into the image plane.
+
+ 6. **Optimize!** To minimize the reprojection error, I use [`scipy.optimize.minimize`](http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html) with the `'Powell'` solver as a black-box, derivative-free optimizer. Here’s reprojection again, before and after optimization:
+
+
+
+Nearly 100% of the program runtime is spent doing this optimization. I haven’t
+really experimented much with other solvers, or with using a specialized
+solver for [nonlinear least squares](https://en.wikipedia.org/wiki/Non-linear_least_squares)
+problems (which is exactly what this is, by the way). It
+might be possible to speed up the optimization a lot!
+
+ 7. **Remap image and threshold.** Once the optimization completes, I isolate the pose/shape parameters $r$, $t$, $\alpha$, and $\beta$ to establish a coordinate transformation. The actual dewarp is obtained by projecting a dense mesh of 3D page points via [`cv2.projectPoints`](http://docs.opencv.org/3.0-last-rst/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#cv2.projectPoints) and supplying the resulting image coordinates to [`cv2.remap`](http://docs.opencv.org/3.0-last-rst/modules/imgproc/doc/geometric_transformations.html#cv2.remap). I get the final output with [`cv2.adaptiveThreshold`](http://docs.opencv.org/3.0-last-rst/modules/imgproc/doc/miscellaneous_transformations.html#cv2.adaptiveThreshold) and save it as a bi-level PNG using [Pillow](http://python-pillow.org/). Again, before and after shots:
+
+
+
+# Results
+
+I’ve included several [example images](https://github.com/mzucker/page_dewarp/tree/master/example_input) in
+the github repository to illustrate how the program works on a variety of
+inputs. Here are the images, along with the program output:
+
+**boston_cooking_a.jpg**:
+
+
+
+**boston_cooking_b.jpg**:
+
+
+
+**linguistics_thesis_a.jpg**:
+
+
+
+**linguistics_thesis_b.jpg**:
+
+
+
+I also compiled some statistics about each program run (take the runtimes with
+a grain of salt, this is for a single run on my 2012 MacBook Pro):
+
+Input | Spans | Keypoints | Parameters | Opt. time (s) | Total time (s)
+---|---|---|---|---|---
+boston_cooking_a.jpg | 38 | 554 | 600 | 23.3 | 24.8
+boston_cooking_b.jpg | 38 | 475 | 521 | 18.0 | 18.8
+linguistics_thesis_a.jpg | 20 | 161 | 189 | 5.1 | 6.1
+linguistics_thesis_b.jpg | 7 | 89 | 104 | 4.2 | 5.3
+
+You can see these are not exactly _small_ optimization problems. The smallest
+one has 104 parameters in the model, and the largest has 600. Still, I’m sure
+the optimization speed could be improved by trying out different methods
+and/or using a compiled language.
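+
+The parameter counts in the table are consistent with the model described earlier: three components each for $r$ and $t$, the two slopes, one vertical offset per span, and one horizontal offset per keypoint. This bookkeeping is inferred from the table rather than stated in the text:
+
+$$
+N_{\text{params}} = 3 + 3 + 2 + n + \sum_{i=1}^{n} m_i, \qquad \text{e.g. } 8 + 38 + 554 = 600 \ \text{ and } \ 8 + 7 + 89 = 104
+$$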
+
+# Summary
+
+The way this project unfolded represents a fairly typical workflow for me
+these days: do a bit of reading to collect background knowledge, and then
+figure out how to formulate the entire problem as the output of some
+optimization process. I find it’s a pretty effective way of tackling a large
+number of technical problems. Although I didn’t think of it at the time, the
+overall approach I took here is reminiscent of both [deformable part models](https://people.eecs.berkeley.edu/~rbg/latent/) and [active appearance models](https://www.cs.cmu.edu/~efros/courses/AP06/Papers/matthews_ijcv_2004.pdf), though not as sophisticated as either.
+
+Both Leptonica and the CTM method go one step further than I did, and try to
+model/repair horizontal distortion as well as vertical. That would be useful
+for my code, too – because the cubic spline is not an [arc-length](https://en.wikipedia.org/wiki/Arc_length) parameterization, the text
+is slightly compressed in areas where the cubic spline has a large slope.
+Since this project was mostly intended as a proof-of-concept, I decided not to
+pursue the issue further.
+
+Before putting up the final code on github, I tried out using the automated
+Python style checker [Pylint](https://www.pylint.org/) for the first time. For
+some reason, on its first run it informed me that all of the `cv2` module
+members were undefined, leading to an initial rating of -6.88/10 (yes,
+negative). Putting the line
+
+```python
+
+ # pylint: disable=E1101
+
+```
+
+near the top of the file made it shut up about that. After tweaking the
+program for a while to make Pylint happier, I got the score up to 9.09/10,
+which seems good enough for now. I’m not sure I agree 100% with all of its
+default settings, but it was interesting to try it out and learn a new tool.
+
+I do all of my coding these days in [GNU Emacs](https://www.gnu.org/software/emacs/), which usually suits my needs;
+however, messing around with Pylint led me to discover a feature I had never
+used. Pylint is not fond of short variable names like `h` (but has no problem
+with `i`, go figure). If I use the normal Emacs `query-replace` function bound
+to `M-%` and try to replace `h` with `height` everywhere, I have to pay close
+attention to make sure that it doesn’t also try to replace the `h` in other
+identifiers (like `shape`) as well. A while back, I discovered I could
+sidestep this by using `query-replace-regexp` instead, and entering the
+regular expression `\bh\b` as the search pattern (the `\b` stands for a word
+_b_oundary, so it will only match the entire “word” `h`). On the other hand,
+it’s a bit more work, and I thought there must be a better way to do
+“whole-word” replacement. A bunch of Googling led me to [this Stack Exchange answer](http://emacs.stackexchange.com/a/12691/12975), which says that using
+the `universal-argument` command `C-u` in Emacs _before_ a `query-replace`
+will do exactly what I want. I never knew about `universal-argument` before –
+always good to learn new tricks!
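+
+The same word-boundary idea carries over to Python's `re` module. The snippet below is a small sketch on a made-up two-line source string (not taken from the dewarping code), contrasting a plain substring replacement with a `\b`-anchored one:
+
+```python
+import re
+
+source = "h, w = img.shape[:2]\nscaled = h * 2"
+
+# Naive replacement also rewrites the "h" inside "shape".
+naive = source.replace("h", "height")
+
+# \b anchors the match at word boundaries, so only the standalone name changes.
+whole_word = re.sub(r"\bh\b", "height", source)
+
+print(naive)
+print(whole_word)
+```
+
+The naive version mangles `shape`, while the anchored pattern touches only the standalone `h`.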
+
+At this point, I don’t anticipate doing much more with the dewarping code. It
+could definitely use a thorough round of commenting, but the basics are pretty
+much spelled out in this document, so I’ll just slap a link here on the
+[github repo](https://github.com/mzucker/page_dewarp) and call it a day. Who
+knows – maybe I’ll refer back to this project again the next time I teach
+[computer vision](http://www.swarthmore.edu/NatSci/mzucker1/e27_s2016/)…
+
+
diff --git a/raw/Static types in Python, oh my(py)!.md b/raw/Static types in Python, oh my(py)!.md
new file mode 100644
index 0000000..53922d0
--- /dev/null
+++ b/raw/Static types in Python, oh my(py)!.md
@@ -0,0 +1,151 @@
+Original: [Static types in Python, oh my(py)!](http://blog.zulip.org/2016/10/13/static-types-in-python-oh-mypy/)
+
+---
+
+Over the last few years, static type checkers have become available for popular dynamic languages like PHP ([Hack](http://hacklang.org/)) and JavaScript ([Flow](https://flowtype.org/) and [TypeScript](https://www.typescriptlang.org/)), and have seen wide adoption. Two years ago, a [provisional syntax for static type annotations](https://www.python.org/dev/peps/pep-0484/) was added to Python 3. However, static types in Python have yet to be widely adopted, because the tool for checking the type annotations, [mypy](http://mypy-lang.org/), was not ready for production use… until now!
+
+The exciting news is that over the last year, a team at Dropbox (including Python creator Guido van Rossum!) has led the development of mypy into a mature type checker that can enforce static type consistency in Python programs. For the many programmers who work in large Python 2 codebases, the even more exciting news is that mypy has full support for type-checking Python 2 programs, scales to large Python codebases, and can substantially simplify the upgrade to Python 3.
+
+The Zulip development community has seen this in action during 2016. [Zulip](https://zulip.org/) is a popular open source group chat application, complete with apps for all major platforms, a REST API, dozens of integrations, etc. To give you a sense of scale, Zulip is about 50,000 lines of Python 2, with dozens of developers contributing hundreds of commits every month. During 2016, we have annotated 100% (!) of our backend with static types using mypy, and thanks to mypy, we are on the verge of switching to Python 3. Zulip is now the largest open source Python project that has fully adopted static types, though I doubt we’ll hold that title for long :).
+
+In this post, I’ll explain how mypy works, the benefits and pain points we’ve seen in using mypy, and share a detailed guide for adopting mypy in a large production codebase (including how to find and fix dozens of issues in a large project in the first few days of using mypy!).
+
+# A brief introduction to mypy
+
+Here is an example of the clean mypy/PEP-484 annotation syntax in Python 3:
+
+```python
+from typing import List
+
+def sum_and_stringify(nums: List[int]) -> str:
+    """Adds up the numbers in a list and returns the result as a string."""
+    return str(sum(nums))
+```
+
+And here’s how that same code looks using the comment syntax for both Python 2 and 3:
+
+```python
+from typing import List
+
+def sum_and_stringify(nums):
+    # type: (List[int]) -> str
+    """Adds up the numbers in a list and returns the result as a string."""
+    return str(sum(nums))
+```
+
+With this comment syntax, mypy supports type-checking normal Python 2 programs, and programs annotated using mypy will run normally with any Python runtime environment (mypy shares this excellent property with the Flow JavaScript type checker). This is awesome because it means projects can adopt mypy without changing anything about how they run Python.
+
+You run mypy on your codebase like a linter, and it reports errors in a nice compiler-style format. For example, here’s what the mypy output would be if I’d incorrectly annotated `sum_and_stringify` as returning a float:
+
+```
+$ mypy /tmp/test.py
+/tmp/test.py: note: In function "sum_and_stringify":
+/tmp/test.py:6: error: Incompatible return value type: expected builtins.float, got builtins.str
+
+```
+
+If you’re curious how to annotate something, check out the [mypy syntax cheat sheet](http://mypy.readthedocs.io/en/latest/cheat_sheet.html) (for simple things) and [PEP-484](https://www.python.org/dev/peps/pep-0484/) (for complex things); they’re great resources. If you want to play with mypy now, you can install it with `pip3 install mypy-lang`.
+
+When mypy has complete type annotations for a module as well as its dependencies, then it can provide a very strong consistency check, similar to what the compiler provides in statically typed languages. Mypy uses the [typeshed](https://github.com/python/typeshed) repository of type “stubs” (type definitions for a module in the style of a header file) to provide type data for both the Python standard library and dozens of popular libraries like requests, six, and sqlalchemy. Importantly, mypy is designed for gradually adding types; if type data for an import isn’t available, it just treats that import as being consistent with anything.
+
+# Benefits of using mypy
+
+Here are the benefits we’ve seen from adopting mypy, starting with the most important:
+
+* **Static type annotations improve readability**. Improved readability is probably the most important benefit of statically typed Python. With static types, every function clearly documents the types of its arguments and return value, which saves a ton of time reading the codebase to figure out the precise types a function expects so that you can call it. In many large codebases, developers write docstrings that contain little information other than the types of the arguments and often end up out of date. Type annotations are far better since they are automatically verified for correctness and consistency. It turned out that the code that we found most difficult to annotate was precisely the code where the type annotations improved readability the most: the cases where it was totally unclear from the code alone what all the objects being passed around are.
+* **We can refactor with confidence**. Refactoring a Python codebase without breaking things involves a lot of careful manual checking to confirm that you’ve updated all the code that used the old interface. With a statically typed Python codebase, the type checker can often verify this for you automatically. We’ve already found this to be [super useful when refactoring Zulip](https://github.com/zulip/zulip/commit/eb09dd217dbd5d9c214b14a5b28b3a12cf8b7854).
+* **We upgraded to Python 3**. Thanks to mypy, Zulip now supports Python 3. For years, the 2to3 library has been able to automatically make most of the changes required to support Python 3. The hard part of the migration is the unicode issue: Python 3 is much more strict about the distinction between arrays of bytes and strings than Python 2. Mypy annotations made this project much easier, since we could explicitly declare which values were which across the entire codebase, and check those declarations for consistency. Even in Zulip, which has an unusually comprehensive automated test suite, mypy caught many Python 3 issues that the tests did not find.
+* **mypy frequently catches bugs**. When we started using mypy, Zulip was a mature web application with unusually high test coverage, so we didn’t find any super embarrassing bugs as part of annotating the codebase. However, mypy [flagged](https://github.com/zulip/zulip/commit/620411c0ea757c9f3f0ae08597e77de84f20ee97) [dozens](https://github.com/zulip/zulip/commit/d77c70220c91a47b06695409e178bb006fe996e8) [of](https://github.com/zulip/zulip/commit/fc02ea9f674e44d99eb12528fceaed64a950bbed) [latent](https://github.com/zulip/zulip/commit/e9f39922a09d2aad90de63a349abe408e93edd4d) [bugs](https://github.com/zulip/zulip/commit/7595e4b05fe69a786cf83cad9c8a77f6e47b2f0f), and dozens more [pieces](https://github.com/zulip/zulip/commit/eee36618fe297bf306c19a7deb11941b53680272) of [confusing](https://github.com/zulip/zulip/commit/c55ac01ae69ef1ca8ce8315ec4b179ec3ebb46a6) code. I’m glad that mypy helped us purge those categories of issues from the codebase. More important for us is that running mypy on code that hasn’t yet been reviewed or tested regularly saves time by catching bugs.
+* **Static types highlight bad interfaces**. Annotated functions with messy interfaces (e.g. sometimes returning a `List` and sometimes a `Tuple`) really stand out.
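+
+As a small illustration of that last point, here is a sketch with two hypothetical functions (`split_words` and `split_words_list` are made up for this example, not taken from Zulip). Once the flag-dependent return type has to be written down, the awkwardness is visible right in the signature:
+
+```python
+from typing import List, Tuple, Union
+
+def split_words(text: str, freeze: bool = False) -> Union[List[str], Tuple[str, ...]]:
+    """Hypothetical helper whose return type depends on a flag."""
+    words = text.split()
+    return tuple(words) if freeze else words
+
+def split_words_list(text: str) -> List[str]:
+    """A narrower alternative: always return a list."""
+    return text.split()
+```
+
+Every caller of the first function has to handle both cases, while the second signature tells them exactly what they get.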
+
+# Pain points
+
+To provide a complete picture of the experience of adopting mypy, I think it’s important to also talk about the pain points with mypy today:
+
+* **Interactions with “unused import” linters**: Currently, linters like `pyflakes` don’t read the Python 2 type annotations style (since the annotations are comments, after all!), and thus will report that imports needed for mypy are unused. Since cleaning up unused imports is only marginally useful anyway, we just filtered out those warnings. This problem will go away when we move to the new Python 3 type syntax, since then type annotations are visible “users” of the imports to tools like pyflakes; I also expect that popular Python linters will add an option to check imports in type comments before long.
+* **Import cycles**: We needed to create a few import cycles in order to import types that were used as arguments (but not explicitly referenced) in a file. One can work around this (in the Python 2 comment syntax) using e.g. `if False: from x import y` (soon to be the cleaner `if typing.TYPE_CHECKING`), but it won’t work when we move to the Python 3 syntax. I’m hopeful that someone will come up with a nice solution to the import cycle creation issue, which may just take the form of recommendations on how to organize one’s code to avoid this.
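+
+A minimal sketch of the guarded-import workaround with the comment syntax might look like the following. The `decimal` module is only a stand-in for a module that would otherwise import this one back, and `format_amount` is a hypothetical function:
+
+```python
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    # Evaluated only by the type checker; never executed at runtime,
+    # so it cannot create a runtime import cycle.
+    from decimal import Decimal  # stand-in for a module that would import us back
+
+def format_amount(amount):
+    # type: (Decimal) -> str
+    """Format a value using only duck-typed operations."""
+    return "{:.2f}".format(amount)
+```
+
+Because the annotation lives in a comment, the name `Decimal` is never looked up at runtime. With inline Python 3 annotations, quoting the name as a PEP 484 forward reference is one option that may help in the same situation.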
+
+## Non-pain points
+
+This section discusses the things that (before trying mypy) I was worried might be problems, but that after adopting mypy I don’t think are significant issues:
+
+* **Training**. Zulip has had well over 100 contributors at this point, so the thing I was most worried about when adopting mypy was that it could hurt the project by adding a new obscure technology that contributors need to learn. This has not turned out to be a material problem: new contributors usually annotate their code correctly on the first try. Python programmers know the Python type system, and the PEP-484 type syntax is a pretty intuitive way of writing it down. The [mypy cheat sheet](http://mypy.readthedocs.io/en/latest/cheat_sheet.html) helps a lot by providing a single place to look up all the common stuff.
+* **False positives**. Mypy is well-engineered and designed around the idea of only reporting things that are actual inconsistencies in a program’s types. It has a far lower false positive rate than the commercial static analyzers I’ve used! And, it has a really nice `type: ignore` system for silencing false positives to get clean output. Finally, we’ve had good luck with typing even some highly dynamic parts of our system (e.g. our [framework for parsing REST API requests](http://zulip.readthedocs.io/en/latest/writing-views.html#request-variables)).
+* **Mypy and typeshed bugs**. Zulip started using mypy in January 2016, when mypy itself was essentially the only open source project using mypy. Because we were the first webapp and the second large project using mypy seriously, we encountered a lot of small bugs at the beginning (in total, Zulip developers have reported ~50 issues in mypy and submitted 16 PRs for typeshed). Essentially all of those issues were fixed upstream (with tests!) months ago, and we rarely encounter regressions. So, I wouldn’t expect similar projects to encounter a similar scale of issues; exactly how many issues an individual project encounters will depend a lot on whether that project uses the same corners of Python as other projects that already use mypy and typeshed. Even as a super early adopter, I feel like we spent far more time annotating our code than investigating, working around with `type: ignore`, and reporting bugs.
+* **Performance.** Mypy today is really fast compared to other static analyzers I’ve used. With the mypy module cache, it takes about 3s to check the entire Zulip codebase and thus is about as fast as pyflakes (one of the faster Python linters). However, because mypy detects cross-file issues (unlike most linters), it doesn’t have a really fast (e.g. <100ms) way to fully check if a given diff/commit introduced a new issue.
+* **Community**. The mypy and typeshed communities are amazing and are impressively responsive to good bug reports (aka ones with a clean, simple reproducer, which I can almost always generate in under 5 minutes).
+
+# Finding bugs in your first few days with mypy
+
+This section details what you need to do in order to start benefiting from mypy in a large codebase. To give you a sense of the scale of work involved, I did everything described in this section in 4 days at a hackathon in January (and back then, mypy wasn’t mature, so I spent half the time filing good bug reports for all the issues I found). If you’re thinking about using mypy but want more data to help make a decision, I’d recommend completing the steps discussed in this section. It’s a pretty good effort/value tradeoff.
+
+**Read the mypy cheat sheet.** The [mypy cheat sheet](http://mypy.readthedocs.io/en/latest/cheat_sheet.html) provides a great overview of the PEP-484 syntax, and you’ll be referring to it often as you start writing annotations.
+
+**Standardize how you’ll run mypy**. Write tooling to [install](https://github.com/zulip/zulip/blob/master/tools/install-mypy) and [run](https://github.com/zulip/zulip/blob/master/tools/run-mypy) `mypy` against your codebase, so that everyone using the project can run the type checker the same way. Two features are important in how you run mypy:
+
+* Support for determining which files should be checked (a whitelist/exclude list is useful!).
+* Specifying the correct flags for your project at this time. For a Python 2 project, I recommend starting with `mypy --py2 --silent-imports --fast-parser -i `. You should be able to do this using a [mypy.ini file](http://mypy.readthedocs.io/en/latest/config_file.html).
+
+**Get mypy running** on your codebase with clean output. This usually requires just adding type annotations for empty global data structures. Back in January, this took a few hours of work (including reporting bugs with reproducers). It’s probably even less work now. By default, mypy will only check the interior of functions that are annotated, so with an unannotated codebase, this is just making sure mypy can analyze your entire codebase.
+
+**Check for basic consistency**. Add the `--check-untyped-defs` option to your mypy arguments, and get that running with no errors on the codebase. This option causes mypy to check every `def` in the codebase for internal consistency; mypy can detect many classes of bugs and mistakes in your codebase, without your having written a single type annotation!
+
+In most cases, you’ll want to fix bugs and bad code as you go, but you can also use `# type: ignore` annotations or exclude files to defer issues. For example, we excluded all of Zulip’s tests at first, since they’re both lower value to type-check and had most of the monkey-patching and other questionable Python in the project. For Zulip, getting clean `--check-untyped-defs` output took me about 2 days of hard work, including merging fixes for about 40 issues in the Zulip codebase.
+
+I spent another day or two generating nice reproducers for the mypy bugs I’d encountered and improving typeshed. Now that mypy is no longer in its infancy, mypy bugs are increasingly rare. But a large project should expect to encounter and fix typeshed issues (just submit a PR!).
+
+**Run mypy in continuous integration**. Once `mypy --check-untyped-defs` passes on your codebase, you should lock down your progress by running the `mypy` checker in your CI environment.
+
+Because mypy type annotations are optional, once you’ve done the setup above, you can start annotating your codebase at whatever pace you like. Over time, you’ll get the benefits of static types in the parts of the codebase that you’ve annotated, without needing to do anything special to the rest of the codebase (the wonders of gradual typing!). In the next section, I’ll discuss strategy for how to move the codebase towards being fully annotated.
+
+# Fully annotating a large codebase
+
+This section discusses what’s involved in going from having mypy set up to having a fully annotated codebase.
+
+The great thing about mypy is that you can do everything gradually. After the initial setup, we actually didn’t do anything with mypy for a couple months. That changed when we posted mypy annotations as one of our project ideas for Google Summer of Code (GSoC). We found an incredible student, Eklavya Sharma, for the project. Eklavya did the vast majority of the hard work of annotating Zulip, including upgrading our tooling, annotating the core library, contributing bug reports and PRs to mypy and typeshed upstream, and fixing all the mistakes we made early on. Amazingly, he also found the time during the summer to migrate Zulip to use virtualenvs and then upgrade Zulip to Python 3!
+
+One can divide the work of annotating a large project into a few phases.
+
+**Phase 1: Annotate the core library**. Strategically, you want to annotate the core library code that is used in lots of other files first. The annotations for these functions impose constraints on the types used in the rest of the codebase, so you’ll spend less time fixing incorrect annotations and catch more actual bugs faster if you do these files first. This is also a good time to [document how your project is using mypy](http://zulip.readthedocs.io/en/latest/mypy.html) (and link that doc from mypy failures in the CI system).
+
+**Phase 2: Annotate the bulk of the codebase**. Many projects will likely do this slowly over many months as developers touch the different parts of the codebase, which is a pretty reasonable strategy.
+
+It also works well to annotate a codebase as a focused effort; the story of how we did this for Zulip is instructive. About halfway through Eklavya’s summer of code, we went to the [PyCon sprints](https://us.pycon.org/2016/community/sprints/) with the goal of annotating as much of Zulip as possible. The PyCon sprints are my favorite part of PyCon. It’s an awesome 4-day party after the main PyCon conference where hundreds of developers work on open source projects together. It’s completely free to attend, and is a great opportunity to get involved in contributing to open source projects.
+
+We grabbed a few tables next to the mypy developers, and managed to attract a rotating cast of 5–10 developers to the Zulip mypy annotation project each day. During the PyCon sprints, Zulip went from 17% annotated to 85% (with 25–30 engineer-days of work, mostly by folks new to both Zulip and mypy). We used mypy’s coverage support and coveralls.io to track progress, but the more fun progress bar was our giant sheet of paper, pictured here at the start of the last day:
+
+
+
+Our PyCon experience is, I think, the best proof that mypy is accessible to new developers: aside from me, all of the contributors adding annotations were new to both Zulip and mypy. We found that with a good 5-minute demo and good documentation, new contributors were effective within their first hour of working with mypy. I would definitely recommend the mypy hackathon approach to other open source projects — it’s a great way for contributors to have meaningful impact on an unfamiliar project.
+
+**Phase 3: Get to 100%.** Annotating the last several files is relatively difficult work, because this is where you end up debugging all the mistakes you made in Phase 2. While you do this, it’s important to lock down files/directories that reach 100% by adding the `--disallow-untyped-defs` option (which will report any functions missing type annotations) to the `mypy` flags to prevent regressions.
+
+Eklavya brought us from 85% to 96% before his college started up again, and then a few weeks ago we spent the couple hours needed to get to 100%. Now, all new Python code going into Zulip comes with mypy annotations (with the exception of a shrinking list of scripts, settings, and test files).
+
+**Phase 4: Celebrate and write a blog post**! At least, that was the next step for Zulip :)
+
+Overall, the focused time that went into fully annotating Zulip was a 1-week hackathon, a GSoC project, and a party at the PyCon sprints. In the scheme of things, this is a pretty modest level of effort.
+
+I should mention that even though Zulip is 100% annotated, Zulip’s mypy journey is not complete. We will eventually want to add stubs to typeshed for the most important libraries used by Zulip (e.g. Django).
+
+# Recommendations for annotating code
+
+We have a few recommendations for annotating that will likely save you a lot of time:
+
+* Make sure to handle `str` vs. `Text` correctly. Bytes vs. str and str vs. unicode errors are the majority of the problem for moving a codebase from Python 2 to Python 2+3. If you do this right as you annotate your codebase, you’ll save yourself a lot of time when you upgrade to Python 3; we found the Python 3 upgrade went very quickly once we had the codebase mostly annotated. We ended up adding [a couple helper functions](https://github.com/zulip/zulip/blob/master/zerver/lib/str_utils.py) to do casts between `str`, `Text`, and `bytes` correctly (and readably!) in our codebase on both Python 2 and Python 3.
+* Remember to annotate class variables! We neglected to do this when we first annotated our Django models file, and ended up having to fix a number of incorrect annotations (primarily around Text vs. bytes) that got into the codebase because they weren’t being cross-checked against anything.
+* Avoid using a guess-and-check approach when adding annotations. In a partially annotated codebase, mypy will still detect many classes of errors in annotations, but it can’t detect every error (since the inconsistency might be with code you haven’t annotated yet). So you should make sure people writing annotations are actually understanding/tracing the code, and that you code review the annotations just like actual code. We found writing one commit per large file (or collection of related smaller files) to be a good commit discipline.
+* Use precise types where possible (e.g. avoid using `Any`).
+* When using `type: ignore` to work around a potential mypy or typeshed bug, I recommend using the following style to record the original GitHub issue:
+
+ `bad_code # type: ignore # https://github.com/python/typeshed/issues/372`
+
+ That way, it’s easy for future you to verify whether the issue that required that use of `type: ignore` has since been fixed upstream. If a file would require a lot of `type: ignore` annotations, you can always add it to the exclude list (another feature of our `run-mypy` wrapper) and plan to come back to it later.
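+
+To give a flavor of the `str` vs. `Text` vs. `bytes` helpers mentioned in the first recommendation, here is a rough sketch. The names `to_text` and `to_bytes` are hypothetical and are not the functions in Zulip's `str_utils.py`; the point is only that each cast gets one annotated, readable place to live:
+
+```python
+from typing import Text, Union
+
+def to_text(value, encoding="utf-8"):
+    # type: (Union[Text, bytes], str) -> Text
+    """Return value as text, decoding bytes if needed (hypothetical helper)."""
+    if isinstance(value, bytes):
+        return value.decode(encoding)
+    return value
+
+def to_bytes(value, encoding="utf-8"):
+    # type: (Union[Text, bytes], str) -> bytes
+    """Return value as bytes, encoding text if needed (hypothetical helper)."""
+    if isinstance(value, bytes):
+        return value
+    return value.encode(encoding)
+```
+
+This sketch is written for Python 3, where `bytes` and `str` are distinct types; a version that also runs on Python 2 would need extra care, since `bytes` is an alias for `str` there.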
+
+# Conclusion
+
+Overall, the experience of using mypy (and the PEP-484 type system) has been awesome, and we feel that adopting mypy has been a big advance for the Zulip project. It improves readability, catches bugs in code without running it, has very few false positives, and hasn’t come with significant downsides. Deploying mypy in a large codebase required relatively little investment on our part, and annotating the codebase had the side benefit of making our Python 3 migration feel easy.
+
+If you have a large Python codebase and would like to make working in your codebase better, you should find a week to get started with mypy!
+
+Finally, if you’re excited to see what static types in Python look like in a large codebase, check out the [Zulip server project on GitHub](https://github.com/zulip/zulip/). We welcome new contributors!
+
+Huge thanks to Guido van Rossum, Alya Abbott, Steve Howell, Jason Chen, Eklavya Sharma, Anurag Goel, and Joshua Simmons for their feedback on this post.
diff --git "a/raw/Stupid Python Tricks\357\200\272 Abusing Explicit Self.md" "b/raw/Stupid Python Tricks\357\200\272 Abusing Explicit Self.md"
new file mode 100644
index 0000000..3193b1c
--- /dev/null
+++ "b/raw/Stupid Python Tricks\357\200\272 Abusing Explicit Self.md"
@@ -0,0 +1,142 @@
+Original: [Stupid Python Tricks: Abusing Explicit Self](https://medium.com/@hwayne/stupid-python-tricks-abusing-explicit-self-53d46b72e9e0)
+
+---
+
+In Ruby, this is how you define a method:
+
+```ruby
+class Foo
+ def bar(baz)
+ baz
+ end
+end
+```
+
+In Python, this is how you define a method:
+
+```python
+class Foo:
+ def bar(self, baz): # What's self doing there?
+ return baz # Not even using it
+```
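+This “explicit self” has been the subject of many, many, many discussions, and plenty of people find it pretty confusing. I’d like to offer a short explanation and then find some ways of abusing explicit self that you should never use. Ready? Let’s get started.
+
+# Strange instance methods
+
+Let’s define the following class: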
+
+```python
+class Number:
+ def __init__(self, x: int) -> None:
+ self.x = x
+ self.minus = lambda y: self.x - y # we'll come back to this
+ def plus(self, y: int) -> int:
+ return self.x + y
+one = Number(1)
+```
+
+What is `one.plus(2)`? That’s pretty simple:
+
+```python
+>>> print(one.plus(2))
+3
+```
+
+But what about `Number.plus(one, 2)`?
+
+```python
+>>> print(Number.plus(one, 2)) # ???
+3 # !!!
+```
+
+Here, plus is a “bound method”. The object doesn’t *furious air quotes* really have a method called ‘plus’. Instead, “one.plus(N)” is a shorthand for “Number.plus(one, N)”, the “unbound method”. With that in mind, it’s pretty obvious why self is needed: plus is a function defined on the class that takes two parameters: the object we’re using, and the number we’re adding.
+
+It’s a little more complicated than that (Python 3 simplifies things a bit), but that’s a pretty useful lie that makes it a lot easier to work out the logic here. Let’s provide a slightly more complex example:
+
+```python
+>>> Number.times = lambda self, y: self.x * y
+>>> print(one.times(2)) # automatically valid
+2
+```
+
+When we call “one.times(2)”, it automatically translates that to “Number.times(one, 2)”, which we just defined. This becomes a lot easier to reason about with the explicit self.
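+
+On Python 3, the relationship between the bound method and the function on the class can be inspected directly. The following is a small sketch that redefines a trimmed-down `Number` so it runs on its own; `__self__` and `__func__` are standard attributes of bound method objects:
+
+```python
+class Number:
+    def __init__(self, x: int) -> None:
+        self.x = x
+
+    def plus(self, y: int) -> int:
+        return self.x + y
+
+one = Number(1)
+bound = one.plus  # attribute lookup on the instance produces a bound method
+
+# The bound method remembers the instance and the underlying function.
+print(bound.__self__ is one)            # True
+print(bound.__func__ is Number.plus)    # True on Python 3
+print(bound(2) == Number.plus(one, 2))  # True
+```
+
+Calling `bound(2)` simply passes the remembered instance as the first argument to the function stored on the class.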
+
+Now let’s double back to that “self.minus” in the initializer. There, we’re not defining the method “properly”. It’s not part of the class definition. We’re just assigning a property to the object which also happens to be callable. Since this isn’t a proper method, it shouldn’t be translated to a bound method in the actual object, so we can drop the self. Similarly, there shouldn’t be an unbound version on the class, so trying to use that will be an error. Let’s give that a test:
+
+```python
+>>> print(one.minus(2))
+-1
+>>> print(Number.minus(one, 2))
+Traceback (most recent call last):
+  File "<stdin>", line 1, in <module>
+AttributeError: type object 'Number' has no attribute 'minus'
+```
+
+Boom.
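+
+One way to see where each callable lives is to look at the instance and class namespaces. This is a sketch for Python 3 that repeats a trimmed-down version of the class so that it is self-contained:
+
+```python
+class Number:
+    def __init__(self, x):
+        self.x = x
+        # Stored on the instance, so it bypasses the method-binding machinery.
+        self.minus = lambda y: self.x - y
+
+    def plus(self, y):
+        return self.x + y
+
+one = Number(1)
+print("minus" in vars(one))      # True: lives in the instance __dict__
+print("plus" in vars(one))       # False: found on the class instead
+print(hasattr(Number, "minus"))  # False: nothing was added to the class
+print(type(one.minus).__name__)  # function
+print(type(one.plus).__name__)   # method
+```
+
+The lambda reaches the object only through the `self` it closed over in `__init__`, not through any binding step.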
+
+# Callback hell
+
+Quick aside before the really stupid stuff. Recall that in Python, functions are first class objects. That means that you can pass functions to other functions like any other kind of object. This means you can do the following:
+
+```python
+>>> def subtract_two(minus):
+...     return minus(2)
+...
+>>> print(subtract_two(one.minus))
+-1
+```
+
+The function we pass in is the bound method of our object. This means that when we call it, it has access to the original object. We can use this for callbacks. Instead of giving a function complete access to our object, we give it only minimal access: just one method. Sometimes this is a good idea. Other times, it means you have to rethink your architecture. If you don’t see why, spend a few weeks writing Javascript front-ends and get back to me.
+
+(Aside on the aside: you should probably check out Javascript anyway, because object prototyping is really close to what we’re doing with unbound methods and JS uses prototyping much more openly and explicitly.)
+
+# The really stupid stuff
+
+Okay, back on track. We’ve seen that if you create a new unbound method, it adds a bound method to all existing objects. It also works with modification: change the unbound method and you change all the bound methods. This means you can dynamically adjust a class based on the runtime. For example, if you’re getting a ton of calls to a particular method, you can start caching it.
+
+Here’s a toy example. We want to add a division method to Number. We want to start logging all divisions if and only if at some point we try to divide by zero. We can do it like this:
+
+```python
+def _divide(self, y):
+ try:
+ return self.x / y
+ except ZeroDivisionError:
+ print("Tried to divide {} by 0".format(self.x))
+ def __newdivide(self, y):
+ print("{} / {}".format(self.x, y))
+ return _divide(self, y)
+ self.__class__.divide = __newdivide
+Number.divide = _divide
+>>> six = Number(6)
+>>> eight = Number(8)
+>>>
+>>> print(six.divide(3))
+2.0
+>>> print(eight.divide(0))
+Tried to divide 8 by 0
+None
+>>> print(six.divide(3))
+6 / 3
+2.0
+```
+
+One object throwing an error triggered improved logging on a completely different one. There’s a few extremely specific cases where manipulating unbound methods might be a good idea. Practically, though, don’t do this in production code. Programming is hard enough as it is. You don’t need your method definitions changing underneath you.
+
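+# Summary
+
+When you define an instance method on a class, it creates a corresponding “bound” method on the object. This is an alias for the unbound method on the class. If you define the instance method directly on the object, there is no corresponding unbound method on the class.
+
+Modifying the unbound method also changes all existing bound methods. You can use this to dynamically modify a class.
+
+Don’t.
+
+Really, don’t.
+
+# Further reading
+
+* [Python method objects](https://docs.python.org/3/tutorial/classes.html#method-objects)
+* [Prototypes in Javascript](http://javascriptissexy.com/javascript-prototype-in-plain-detailed-language/)
+* [Static, class, and abstract methods](https://julien.danjou.info/blog/2013/guide-python-static-class-abstract-methods) (with an important note on how Py2 is a little more complicated)
+* [Callback hell](http://callbackhell.com/)
+* [A counterargument to callback hell](http://thecodebarbarian.com/2015/03/20/callback-hell-is-a-myth)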
\ No newline at end of file
diff --git a/raw/The Python Packaging Ecosystem.md b/raw/The Python Packaging Ecosystem.md
new file mode 100644
index 0000000..fa72900
--- /dev/null
+++ b/raw/The Python Packaging Ecosystem.md
@@ -0,0 +1,391 @@
+Original: [The Python Packaging Ecosystem](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html)
+
+---
+
+From development to deployment
+
+[TOC]
+
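+There have been a few recent articles reflecting on the current state of the Python packaging ecosystem from an end user perspective, so, as one of the lead architects of that ecosystem, it seems worthwhile for me to write up how I characterise the overall problem space of software publication and distribution, where I think we are at the moment, and where I'd like to see us go in the future.
+
+For context, the specific articles I'm replying to are:
+
+  * [Python Packaging is Good Now](https://glyph.twistedmatrix.com/2016/08/python-packaging.html) (Glyph Lefkowitz)
+  * [Conda: Myths and Misconceptions](https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/) (Jake VanderPlas)
+  * [Python Packaging at PayPal](https://www.paypal-engineering.com/2016/09/07/python-packaging-at-paypal/) (Mahmoud Hashemi)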
+
+These are all excellent pieces considering the problem space from different
+perspectives, so if you'd like to learn more about the topics I cover here, I
+highly recommend reading them.
+
+## [My core software ecosystem design philosophy](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id1)
+
+Since it heavily influences the way I think about packaging system design in
+general, it's worth stating my core design philosophy explicitly:
+
+ * As a software consumer, I should be able to consume libraries, frameworks, and applications in the binary format of my choice, regardless of whether or not the relevant software publishers directly publish in that format
+ * As a software publisher working in the Python ecosystem, I should be able to publish my software once, in a single source-based format, and have it be automatically consumable in any binary format my users care to use
+
+This is emphatically _not_ the way many software packaging systems work - for
+a great many systems, the publication format and the consumption format are
+tightly coupled, and the folks managing the publication format or the
+consumption format actively seek to use it as a lever of control over a
+commercial market (think operating system vendor controlled application
+stores, especially for mobile devices).
+
+While we're unlikely to ever pursue the specific design documented in the rest
+of the PEP (hence the "Deferred" status), the "[Development, Distribution, and Deployment of Python Software](https://www.python.org/dev/peps/pep-0426/#development-distribution-and-deployment-of-python-software)" section of PEP
+426 provides additional details on how this philosophy applies in practice.
+
+I'll also note that while I now work on software supply chain management
+tooling at Red Hat, that _wasn't_ the case when I first started actively
+participating in the upstream Python packaging ecosystem [design process](https://lwn.net/Articles/580399/). Back then I was working on Red
+Hat's main [hardware integration testing system](https://beaker-project.org/),
+and growing increasingly frustrated with the level of effort involved in
+integrating new Python level dependencies into Beaker's RPM based development
+and deployment model. Getting actively involved in tackling these problems on
+the Python upstream side of things then led to also getting more actively
+involved in addressing them on the [Red Hat downstream side](http://www.slideshare.net/ncoghlan_dev/developing-in-python-on-red-hat-platforms-devnation-2016).
+
+## [The key conundrum](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id2)
+
+When talking about the design of software packaging ecosystems, it's very easy
+to fall into the trap of only considering the "direct to peer developers" use
+case, where the software consumer we're attempting to reach is another
+developer working in the same problem domain that we are, using a similar set
+of development tools. Common examples of this include:
+
+ * Linux distro developers publishing software for use by other contributors to the same Linux distro ecosystem
+ * Web service developers publishing software for use by other web service developers
+ * Data scientists publishing software for use by other data scientists
+
+In these more constrained contexts, you can frequently get away with using a
+single toolchain for both publication and consumption:
+
+ * Linux: just use the system package manager for the relevant distro
+ * Web services: just use the Python Packaging Authority's twine for publication and pip for consumption
+ * Data science: just use conda for everything
+
+For newer languages that start in one particular domain with a preferred
+package manager and expand outwards from there, the apparent simplicity
+arising from this homogeneity of use cases may frequently be attributed as an
+essential property of the design of the package manager, but that perception
+of inherent simplicity will typically fade if the language is able to
+successfully expand beyond the original niche its default package manager was
+designed to handle.
+
+In the case of Python, for example, distutils was designed as a consistent
+build interface for Linux distro package management, setuptools for plugin
+management in the Open Source Application Foundation's Chandler project, pip
+for dependency management in web service development, and conda for local
+language-independent environment management in data science. distutils and
+setuptools haven't fared especially well from a usability perspective when
+pushed beyond their original design parameters (hence the current efforts to
+make it easier to use full-fledged build systems like Scons and Meson as an
+alternative when publishing Python packages), while pip and conda both seem to
+be doing a better job of accommodating increases in their scope of
+application.
+
+This history helps illustrate that where things really have the potential to
+get complicated (even beyond the inherent challenges of domain-specific
+software distribution) is when you start needing to _cross domain boundaries_.
+For example, as the lead maintainer of `contextlib` in the Python standard
+library, I'm also the maintainer of the `contextlib2` backport project on
+PyPI. That's not a domain specific utility - folks may need it regardless of
+whether they're using a self-built Python runtime, a pre-built Windows or Mac
+OS X binary they downloaded from python.org, a pre-built binary from a Linux
+distribution, a CPython runtime from some other redistributor (homebrew,
+pyenv, Enthought Canopy, ActiveState, Continuum Analytics, AWS Lambda, Azure
+Machine Learning, etc), or perhaps even a different Python runtime entirely
+(PyPy, PyPy.js, Jython, IronPython, MicroPython, VOC, Batavia, etc).
+
+Fortunately for me, I _don't_ need to worry about all that complexity in the
+wider ecosystem when I'm specifically wearing my `contextlib2` maintainer hat
+- I just publish an sdist and a universal wheel file to PyPI, and the rest of
+the ecosystem has everything it needs to take care of redistribution and end
+user consumption without any further input from me.
+
+However, `contextlib2` is a pure Python project that only depends on the
+standard library, so it's pretty much the simplest possible case from a
+tooling perspective (the only reason I needed to upgrade from distutils to
+setuptools was so I could publish my own wheel files, and the only reason I
+haven't switched to using the _much_ simpler pure-Python-only flit instead of
+either of them is that that doesn't yet easily support publishing backwards
+compatible setup.py based sdists).
+
+This means that things get significantly more complex once we start wanting to
+use and depend on components written in languages other than Python, so that's
+the broader context I'll consider next.
+
+## [Platform management or plugin management?](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id3)
+
+When it comes to handling the software distribution problem in general, there
+are two main ways of approaching it:
+
+ * design a plugin management system that doesn't concern itself with the management of the application framework that _runs_ the plugins
+ * design a platform component manager that not only manages the plugins themselves, but _also_ the application frameworks that run them
+
+This "plugin manager or platform component manager?" question shows up over
+and over again in software distribution architecture designs, but the case of
+most relevance to Python developers is in the contrasting approaches that pip
+and conda have adopted to handling the problem of external dependencies for
+Python projects:
+
+ * pip is a _plugin manager_ for Python runtimes. Once you have a Python runtime (any Python runtime), pip can help you add pieces to it. However, by design, it won't help you manage the underlying Python runtime (just as it wouldn't make any sense to try to install Mozilla Firefox as a Firefox Add-On, or Google Chrome as a Chrome Extension)
+ * conda, by contrast, is a _component manager_ for a cross-platform platform that provides its own Python runtimes (as well as runtimes for other languages). This means that you can get _pre-integrated_ components, rather than having to do your own integration between plugins obtained via pip and language runtimes obtained via other means
+
+What this means is that pip, _on its own_, is not in any way a direct
+alternative to conda. To get comparable capabilities to those offered by
+conda, you have to add in a mechanism for obtaining the underlying language
+runtimes, which means the alternatives are combinations like:
+
+ * apt-get + pip
+ * dnf + pip
+ * yum + pip
+ * pyenv + pip
+ * homebrew (Mac OS X) + pip
+ * python.org Windows installer + pip
+ * Enthought Canopy
+ * ActiveState's Python runtime + PyPM
+
+This is the main reason why "just use conda" is excellent advice to any
+prospective Pythonista that isn't already using one of the platform component
+managers mentioned above: giving that answer replaces an otherwise operating
+system dependent or Python specific answer to the runtime management problem
+with a cross-platform and (at least somewhat) language neutral one.
+
+It's an especially good answer for Windows users, as Chocolatey/OneGet/Windows
+Package Management isn't remotely comparable to pyenv or homebrew at this
+point in time, other runtime managers don't work on Windows, and getting folks
+bootstrapped with MinGW, Cygwin or the new (still experimental) Windows
+Subsystem for Linux is just another hurdle to place between them and whatever
+goal they're learning Python for in the first place.
+
+However, conda's pre-integration based approach to tackling the external
+dependency problem is also why "just use conda for everything" isn't a
+sufficient answer for the Python software ecosystem as a whole.
+
+If you're working on an operating system component for Fedora, Debian, or any
+other distro, you actually _want_ to be using the system provided Python
+runtime, and hence need to be able to readily convert your upstream Python
+dependencies into policy compliant system dependencies.
+
+Similarly, if you're wanting to support folks that deploy to a preconfigured
+Python environment in services like AWS Lambda, Azure Cloud Functions, Heroku,
+OpenShift or Cloud Foundry, or that use alternative Python runtimes like PyPy
+or MicroPython, then you need a publication technology that doesn't tightly
+couple your releases to a specific version of the underlying language runtime.
+
+As a result, pip and conda end up existing at slightly different points in the
+system integration pipeline:
+
+ * Publishing and consuming Python software with pip is a matter of "bring your own Python runtime". This has the benefit that you _can_ readily bring your own runtime (and manage it using whichever tools make sense for your use case), but also has the downside that you _must_ supply your own runtime (which can sometimes prove to be a significant barrier to entry for new Python users, as well as being a pain for cross-platform environment management).
+ * Like Linux system package managers before it, conda takes away the requirement to supply your own Python runtime by providing one for you. This is great if you don't have any particular preference as to which runtime you want to use, but if you _do_ need to use a different runtime for some reason, you're likely to end up fighting against the tooling, rather than having it help you. (If you're tempted to answer "Just add another interpreter to the pre-integrated set!" here, keep in mind that doing so without the aid of a runtime independent plugin manager like pip acts as a _multiplier_ on the platform level integration testing needed, which can be a significant cost even when it's automated)
+
+## [Where do we go next?](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id4)
+
+In case it isn't already clear from the above, I'm largely happy with the
+respective niches that pip and conda are carving out for themselves as a
+plugin manager for Python runtimes and as a cross-platform platform focused on
+(but not limited to) data analysis use cases.
+
+However, there's still plenty of scope to improve the effectiveness of the
+collaboration between the upstream Python Packaging Authority and downstream
+Python redistributors, as well as to reduce barriers to entry for
+participation in the ecosystem in general, so I'll go over some of the key
+areas I see for potential improvement.
+
+### [Sustainability and the bystander effect](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id5)
+
+It's not a secret that the core PyPA infrastructure (PyPI, pip, twine,
+setuptools) is [nowhere near as well-funded](https://caremad.io/posts/2016/05/powering-pypi/)
+as you might expect given its criticality to the operations
+of some truly enormous organisations.
+
+The biggest impact of this is that even when volunteers show up ready and
+willing to work, there may not be anybody in a position to effectively
+_wrangle_ those volunteers, and help keep them collaborating effectively and
+moving in a productive direction.
+
+To secure long term sustainability for the core Python packaging
+infrastructure, we're only talking amounts on the order of a few hundred
+thousand dollars a year - enough to cover some dedicated operations and
+publisher support staff for PyPI (freeing up the volunteers currently handling
+those tasks to help work on ecosystem improvements), as well as to fund
+targeted development directed at some of the other problems described below.
+
+However, rather than being a true "[tragedy of the
+commons](https://en.wikipedia.org/wiki/Tragedy_of_the_commons)", I personally
+chalk this situation up to a different human cognitive bias: the [bystander
+effect](https://en.wikipedia.org/wiki/Bystander_effect).
+
+The reason I think that is that we have _so many_ potential sources of the
+necessary funding that even folks that agree there's a problem that needs to
+be solved are assuming that someone else will take care of it, without
+actually checking whether or not that assumption is entirely valid.
+
+The primary responsibility for correcting that oversight falls squarely on the
+Python Software Foundation, which is why the Packaging Working Group was
+formed in order to investigate possible sources of additional funding, as well
+as to determine how any such funding can be spent most effectively.
+
+However, a secondary responsibility also falls on customers and staff of
+commercial Python redistributors, as this is _exactly_ the kind of ecosystem
+level risk that commercial redistributors are being paid to manage on behalf
+of their customers, and they're currently not handling this particular
+situation very well. Accordingly, anyone that's actually _paying_ for CPython,
+pip, and related tools (either directly or as a component of a larger
+offering), and expecting them to be supported properly as a result, really
+needs to be asking some very pointed questions of their suppliers right about
+now. (Here's a sample question: "We pay you X dollars a year, and the upstream
+Python ecosystem is one of the things we expect you to support with that
+revenue. How much of what we pay you goes towards maintenance of the upstream
+Python packaging infrastructure that we rely on every day?").
+
+One key point to note about the current situation is that as a 501(c)(3)
+public interest charity, any work the PSF funds will be directed towards
+better fulfilling that public interest mission, and that means focusing
+primarily on the needs of educators and non-profit organisations, rather than
+those of private for-profit entities.
+
+Commercial redistributors are thus _far_ better positioned to properly
+represent their customers' interests in areas where their priorities may
+diverge from those of the wider community (closing the "insider threat"
+loophole in PyPI's current security model is a particular case that comes to
+mind - see [Making PyPI security independent of SSL/TLS](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#making-pypi-security-independent-of-ssl-tls)).
+
+### [Migrating PyPI to pypi.org](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id6)
+
+An instance of the new PyPI implementation (Warehouse) is up and running at
+https://pypi.org/ and connected directly to the production PyPI database, so
+folks can already explicitly opt-in to using it over the legacy implementation
+if they prefer to do so.
+
+However, there's still a non-trivial amount of design, development and QA work
+needed on the new version before all existing traffic can be transparently
+switched over to using it.
+
+Getting at least this step appropriately funded and a clear project management
+plan in place is the main current focus of the PSF's Packaging Working Group.
+
+### [Making the presence of a compiler on end user systems optional](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id7)
+
+Between the `wheel` format and the `manylinux1` usefully-distro-independent
+ABI definition, this is largely handled now, with `conda` available as an
+option to handle the relatively small number of cases that are still a problem
+for `pip`.
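+
+To make the role of the compatibility tags concrete, here is a minimal sketch that splits a wheel filename into the fields defined by the wheel filename convention (PEP 427). The `parse_wheel_filename` helper and the example filenames are illustrative assumptions, not part of pip or any other tool.
+
+```python
+from collections import namedtuple
+
+WheelName = namedtuple(
+    "WheelName", "distribution version build python_tag abi_tag platform_tag"
+)
+
+
+def parse_wheel_filename(filename):
+    """Split '{dist}-{version}(-{build})?-{python}-{abi}-{platform}.whl'."""
+    if not filename.endswith(".whl"):
+        raise ValueError("not a wheel filename: %r" % filename)
+    parts = filename[: -len(".whl")].split("-")
+    if len(parts) == 5:       # no optional build tag
+        dist, version, python_tag, abi_tag, platform_tag = parts
+        build = None
+    elif len(parts) == 6:     # optional build tag present
+        dist, version, build, python_tag, abi_tag, platform_tag = parts
+    else:
+        raise ValueError("unexpected wheel filename: %r" % filename)
+    return WheelName(dist, version, build, python_tag, abi_tag, platform_tag)
+
+
+# A pure Python "universal" wheel: any interpreter, no ABI, any platform.
+universal = parse_wheel_filename("example_pkg-1.0-py2.py3-none-any.whl")
+# A compiled wheel built against the manylinux1 baseline for CPython 3.5.
+binary = parse_wheel_filename("example_pkg-1.0-cp35-cp35m-manylinux1_x86_64.whl")
+
+print(universal.python_tag, universal.abi_tag, universal.platform_tag)
+print(binary.python_tag, binary.abi_tag, binary.platform_tag)
+```
+
+An installer can compare these tags with what the running interpreter supports and fall back to the sdist when no wheel matches.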
+
+The main unsolved problem is to allow projects to properly express the
+constraints they place on target environments so that issues can be detected
+at install time or repackaging time, rather than only being detected as
+runtime failures. Such a feature will also greatly expand the ability to
+correctly generate platform level dependencies when converting Python projects
+to downstream package formats like those used by conda and Linux system
+package managers.
+
+### [Bootstrapping dependency management tools on end user systems](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id8)
+
+With pip being bundled with recent versions of CPython (including CPython 2.7
+maintenance releases), and pip (or a variant like upip) also being bundled
+with most other Python runtimes, the ecosystem bootstrapping problem has
+largely been addressed for new Python users.
+
+There are still a few usability challenges to be addressed (like defaulting to
+per-user installations when outside a virtual environment, interoperating more
+effectively with platform component managers like conda, and providing an
+officially supported installation interface that works at the Python prompt
+rather than via the operating system command line), but those don't require
+the same level of political coordination across multiple groups that was
+needed to establish pip as the lowest common denominator approach to
+dependency management for Python applications.
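+
+As an illustration of the per-user default and of driving pip from Python, the sketch below builds a pip command tied to whichever interpreter is currently running. `pip_install_command` and `in_virtualenv` are hypothetical helpers; this is one possible approach, not an officially supported interface.
+
+```python
+import sys
+
+
+def in_virtualenv():
+    """Best-effort check for venv (base_prefix) or virtualenv (real_prefix)."""
+    base = getattr(sys, "base_prefix", sys.prefix)
+    return base != sys.prefix or hasattr(sys, "real_prefix")
+
+
+def pip_install_command(requirement, user=None):
+    """Build (but do not run) a pip command for the current interpreter."""
+    if user is None:
+        user = not in_virtualenv()   # default to per-user outside a venv
+    command = [sys.executable, "-m", "pip", "install"]
+    if user:
+        command.append("--user")
+    command.append(requirement)
+    return command
+
+
+command = pip_install_command("contextlib2")
+print(" ".join(command))
+# To actually install, pass the list to subprocess.check_call(command).
+```
+
+Running pip as `python -m pip` ensures the package lands in the same runtime that will later import it.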
+
+### [Making the use of distutils and setuptools optional](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id9)
+
+As mentioned above, distutils was designed ~18 years ago as a common interface
+for Linux distributions to build Python projects, while setuptools was
+designed ~12 years ago as a plugin management system for an open source
+Microsoft Exchange replacement. While both projects have given admirable
+service in their original target niches, and quite a few more besides, their
+age and original purpose means they're significantly more complex than what a
+user needs if all they want to do is to publish their pure Python library or
+framework to the Python Package index.
+
+Their underlying complexity also makes it incredibly difficult to improve the
+problematic state of their documentation, which is split between the legacy
+distutils documentation in the CPython standard library and the additional
+setuptools specific documentation in the setuptools project.
+
+Accordingly, what we want to do is to change the way build toolchains for
+Python projects are organised to have 3 clearly distinct tiers:
+
+ * toolchains for pure Python projects
+ * toolchains for Python projects with simple C extensions
+ * toolchains for C/C++/other projects with Python bindings
+
+This allows folks to be introduced to simpler tools like flit first, better
+enables the development of potential alternatives to setuptools at the second
+tier, and supports the use of full-fledged pip-installable build systems like
+Scons and Meson at the third tier.
+
+The first step in this project, defining the `pyproject.toml` format to allow
+declarative specification of the dependencies needed to launch `setup.py`, has
+been implemented, and Daniel Holth's `enscons` project demonstrates that that
+is already sufficient to bootstrap an external build system even without the
+later stages of the project.
+
+Future steps include providing native support for `pyproject.toml` in `pip`
+and `easy_install`, as well as defining a declarative approach to invoking the
+build system rather than having to run `setup.py` with the relevant distutils
+& setuptools flags.
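+
+A minimal sketch of what that declarative specification looks like, based on the `[build-system]` table from PEP 518. The requirement list is only an example, and reading it with `tomllib` assumes Python 3.11 or later; this is not how pip itself is implemented.
+
+```python
+import tomllib  # standard library TOML parser, Python 3.11+
+
+# Example pyproject.toml contents; the listed requirements are illustrative.
+PYPROJECT_TOML = """
+[build-system]
+requires = ["setuptools", "wheel"]
+"""
+
+
+def build_requirements(pyproject_text):
+    """Return the packages that must be installed before the build starts."""
+    data = tomllib.loads(pyproject_text)
+    return data.get("build-system", {}).get("requires", [])
+
+
+requires = build_requirements(PYPROJECT_TOML)
+print(requires)
+```
+
+Because the file is plain data, a tool can learn what to install before it ever has to execute `setup.py`.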
+
+### [Making PyPI security independent of SSL/TLS](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id10)
+
+PyPI currently relies entirely on SSL/TLS to protect the integrity of the link
+between software publishers and PyPI, and between PyPI and software consumers.
+The only protections against insider threats from within the PyPI
+administration team are ad hoc usage of GPG artifact signing by some projects,
+personal vetting of new team members by existing team members and 3rd party
+checks against previously published artifact hashes unexpectedly changing.
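+
+The third-party hash check mentioned above can be sketched as follows. The artifact bytes and the `verify_artifact` helper are made up for illustration; a real tool would read the downloaded file from disk and compare it with a hash recorded at an earlier, trusted point in time.
+
+```python
+import hashlib
+import hmac
+
+
+def sha256_hex(data):
+    """Hex digest of the artifact contents."""
+    return hashlib.sha256(data).hexdigest()
+
+
+def verify_artifact(data, expected_sha256):
+    """True only if the artifact still matches the previously recorded hash."""
+    return hmac.compare_digest(sha256_hex(data), expected_sha256)
+
+
+# Stand-in for a release file downloaded when the hash was first recorded.
+original = b"pretend this is example_pkg-1.0.tar.gz"
+recorded = sha256_hex(original)
+
+# Later downloads are compared against the recorded value.
+unchanged_ok = verify_artifact(original, recorded)
+tampered_ok = verify_artifact(original + b" with a backdoor", recorded)
+print(unchanged_ok, tampered_ok)
+```
+
+A check like this only detects changes after the hash was recorded; it cannot show that the first upload was trustworthy, which is the gap end-to-end signing is meant to close.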
+
+A credible design for end-to-end package signing that adequately accounts for
+the significant usability issues that can arise around publisher and consumer
+key management has been available for almost 3 years at this point (see
+[Surviving a Compromise of PyPI](https://www.python.org/dev/peps/pep-0458/)
+and [Surviving a Compromise of PyPI: the Maximum Security
+Edition](https://www.python.org/dev/peps/pep-0480/)).
+
+However, implementing that solution has been gated not only on being able to
+first retire the legacy infrastructure, but also the PyPI administrators being
+able to credibly commit to the key management obligations of operating the
+signing system, as well as to ensuring that the system-as-implemented actually
+provides the security guarantees of the system-as-designed.
+
+Accordingly, this isn't a project that can realistically be pursued until the
+underlying sustainability problems have been suitably addressed.
+
+### [Automating wheel creation](http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#id11)
+
+While redistributors will generally take care of converting upstream Python
+packages into their own preferred formats, the Python-specific wheel format is
+currently a case where it is left up to publishers to decide whether or not to
+create them, and if they do decide to create them, how to automate that
+process.
+
+Having PyPI take care of this process automatically is an obviously desirable
+feature, but it's also an incredibly expensive one to build and operate.
+
+Thus, it currently makes sense to defer this cost to individual projects, as
+there are quite a few commercial continuous integration and continuous
+deployment service providers willing to offer free accounts to open source
+projects, and these can also be used for the task of producing release
+artifacts. Projects also remain free to only publish source artifacts, relying
+on pip's implicit wheel creation and caching and the appropriate use of
+private PyPI mirrors and caches to meet the needs of end users.
+
+For downstream platform communities already offering shared build
+infrastructure to their members (such as Linux distributions and conda-forge),
+it may make sense to offer Python wheel generation as a supported output
+option for cross-platform development use cases, in addition to the platform's
+native binary packaging format.
diff --git a/raw/Visualizing relationships between python packages.md b/raw/Visualizing relationships between python packages.md
new file mode 100644
index 0000000..63e5b8e
--- /dev/null
+++ b/raw/Visualizing relationships between python packages.md
@@ -0,0 +1,592 @@
+Original: [Visualizing relationships between python packages](https://kozikow.com/2016/07/10/visualizing-relationships-between-python-packages-2/)
+
+---
+
+## Introduction
+
+I used the [github data on BigQuery](https://github.com/blog/2201-making-open-source-data-more-available%20) to extract co-occurrence relationships between the top 3500 python packages in github repos. The result is visualized as a [force-directed graph in d3 with velocity Verlet integration](https://github.com/d3/d3-force). I also clustered the graph using an algorithm from [python-igraph](https://pypi.python.org/pypi/python-igraph) and uploaded it to graphistry.
+
+The clusters in the d3 visualization can be explored in the [live version](http://clustering.kozikow.com?center=numpy).
+
+The numpy cluster on its own, as extracted from graphistry, is also available as a [live version](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=numpycluster&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true).
+
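+Graph properties:
+
+ * Each node is a python package found on github. Its radius is calculated in the [DataFrame with nodes](https://kozikow.com/2016/07/10/visualizing-relationships-between-python-packages-2/#DataFrame-with-nodes) section.
+ * For two packages A and B, the edge weight is based on the number of times A and B occur within the same file. I will soon migrate it to [normalized pointwise mutual information](https://en.wikipedia.org/wiki/Pointwise_mutual_information#Normalized_pointwise_mutual_information_.28npmi.29), which is somewhat harder to compute in BigQuery.
+ * Edges with a weight below 0.1 are removed.
+ * Following the [simulation parameters](https://kozikow.com/2016/07/10/visualizing-relationships-between-python-packages-2/#Simulation-parameters), the d3 algorithm searches for a minimum-energy state using velocity Verlet integration.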
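+
+A minimal sketch of the co-occurrence counting behind the edges, on made-up data. Since normalized pointwise mutual information (NPMI) is the measure the post plans to move to, the weight below uses the standard NPMI definition. The sample files, the `npmi_edges` helper and the reuse of the 0.1 threshold with NPMI are assumptions for illustration, not the author's actual pipeline.
+
+```python
+import math
+from collections import Counter
+from itertools import combinations
+
+
+def npmi_edges(files, min_weight=0.1):
+    """Map each package pair to its NPMI, dropping edges below min_weight."""
+    total = len(files)
+    single = Counter()   # number of files each package appears in
+    pair = Counter()     # number of files each pair appears in together
+    for packages in files:
+        unique = sorted(set(packages))
+        single.update(unique)
+        pair.update(combinations(unique, 2))
+
+    edges = {}
+    for (a, b), count in pair.items():
+        p_ab = count / total
+        if p_ab == 1.0:
+            weight = 1.0  # NPMI limit: the pair appears in every file
+        else:
+            pmi = math.log(p_ab / ((single[a] / total) * (single[b] / total)))
+            weight = pmi / -math.log(p_ab)
+        if weight >= min_weight:
+            edges[(a, b)] = weight
+    return edges
+
+
+# Each inner list stands for the packages imported by one file.
+files = [
+    ["numpy", "scipy"],
+    ["numpy", "scipy", "os"],
+    ["os", "sys"],
+    ["os", "sys", "numpy"],
+]
+edges = npmi_edges(files)
+for pair_key, weight in sorted(edges.items()):
+    print(pair_key, round(weight, 3))
+```
+
+In this toy data only the numpy/scipy and os/sys edges survive the threshold; pairs that co-occur no more often than chance get a negative NPMI and are dropped.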
+
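+You can try my app at [clustering.kozikow.com](http://clustering.kozikow.com/?center=numpy). You can:
+
+ * Pass a different package name as the query parameter in the URL.
+ * Scroll the page horizontally and vertically.
+ * Click a node to open the corresponding page on pypi. Note that not every package is on pypi.
+
+The interesting graphistry views are in the next section, [Analysis of specific clusters](https://kozikow.com/2016/07/10/visualizing-relationships-between-python-packages-2/#Analysis-of-specific-clusters).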
+
+Graph visualizations often lack actionable insights beyond looking cool. Types
+of insights you can use this one for:
+
+ * Find packages you were not aware of in the close proximity of other packages that you use.
+ * Evaluate different web development frameworks based on size, adoption and library availability (e.g. [Flask](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=flashcluster&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true) vs [django](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=djangocluster&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true)).
+ * Find some interesting python use cases, like the [robotics cluster](http://clustering.kozikow.com/?center=rospy).
+
+[The revision history of this post is on github](https://github.com/kozikow/kozikow-blog/blob/master/clustering/clustering.org) in [org-mode](https://kozikow.com/2016/05/21/very-powerful-data-analysis-environment-org-mode-with-ob-ipython/) format.
+
+## Analysis of specific clusters
+
+In addition to the d3 visualization, I also clustered the data using the [python-igraph](https://pypi.python.org/pypi/python-igraph)
+`community_infomap().membership` and uploaded it to graphistry. The ability to
+exclude and filter by clusters was very useful.
+
+### Scientific computing cluster
+
+Unsurprisingly, it is centered on numpy. It is interesting that it is possible
+to see the divide between statistics and machine learning.
+
+ * [d3 link](http://clustering.kozikow.com/?center=numpy)
+ * [graphistry link](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=numpycluster&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true)
+
+### Web frameworks cluster
+
+Web frameworks are interesting:
+
+#### d3 links
+
+ * It could be said that [sqlalchemy](http://clustering.kozikow.com/?center=sqlalchemy) is the center of web-framework land.
+ * Nearby, there's a massive and monolithic cluster for [django](http://clustering.kozikow.com/?center=django).
+ * There are smaller nearby clusters for [flask](http://clustering.kozikow.com/?center=flask) and [pyramid](http://clustering.kozikow.com/?center=pyramid).
+ * [pylons](http://clustering.kozikow.com/?center=pylons), lacking a cluster of its own, sits between django and sqlalchemy.
+ * There is a small cluster for [zope](http://www.zope.org/), also near sqlalchemy.
+ * [tornado](http://clustering.kozikow.com/?center=tornado) got swallowed by the big cluster of standard library in the middle, but is still close to other web frameworks.
+ * Some smaller web frameworks like [gluon (web2py)](http://clustering.kozikow.com/?center=gluon) or [turbo gears](http://clustering.kozikow.com/?center=tg) ended up close to django, but barely visible and without clusters of their own.
+
+#### Interesting graphistry clusters
+
+ * [Django cluster](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=djangocluster&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true)
+ * [Flask cluster](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=flashcluster&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true)
+ * [Twisted cluster](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=twistedcluster&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true)
+
+### Other interesting clusters
+
+Looking at the results of the clustering algorithm, only "medium sized" clusters are
+interesting. The first few are obvious, such as clusters dominated by packages like
+os and sys. Very small clusters are not interesting either. [Here you can see clusters between positions 5 and 30 according to size](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=topclusters&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true).
+
+Some of the other clusters:
+
+ * Testing cluster, [d3 link](http://clustering.kozikow.com/?center=unittest), [graphistry cluster](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=testclusters&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true)
+ * Openstack cluster, [d3 link](http://clustering.kozikow.com/?center=nova), [graphistry link](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=stackcluster&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true)
+ * [String parsing and formatting cluster](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=stringcluster&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true)
+ * [Robotics land](http://clustering.kozikow.com/?center=rospy)
+ * [Gaming cluster](http://clustering.kozikow.com/?center=pygame)
+ * [Deep learning cluster](https://labs.graphistry.com/graph/graph.html?dataset=PyGraphistry/5R2115KURX&type=vgraph&splashAfter=1468271796&info=true&static=true&contentKey=deepcluster&play=0&center=true&menu=false&goLive=false&left=-1.44e+3&right=973&top=-478&bottom=657&poi=true)
+
+## Potential for further analysis
+
+### Other programming languages
+
+The majority of the code is not specific to python. Only the first step, creating a
+table with packages, is specific to python.
+
+I had to do a lot of work on fitting the parameters in Simulation parameters
+to make the graph look good enough. I suspect that I would have to do similar
+fitting for each language, as each language graph would have different
+properties.
+
+I will be working on analyzing Java and Scala next.
+
+### Searching for "alternatives to package X", e.g. seaborn vs bokeh
+
+For example, it would be interesting to cluster together all python data
+visualization packages.
+
+Intuitively, such packages would be used in similar contexts, but would
+rarely be used together. Assuming that our graph is represented as an npmi
+coincidence matrix M, for packages x and y, the correlation of vectors x and y
+would be high, but M[x][y] would be low.
+
+Alternatively, `M^2 /. M` could have some potential. M^2 would roughly
+represent "two hops" in the graph, while `/.` is a pointwise division.
+
+Alternative packages would have a high correlation of their neighbor weights, but a low direct edge.
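+
+A small sketch of that idea on a made-up weight matrix: two packages count as likely alternatives when their neighbor weights are strongly correlated while the direct edge between them is weak. The package names, the weights and the `alternative_score` helper are all illustrative assumptions.
+
+```python
+import math
+
+
+def pearson(xs, ys):
+    """Plain Pearson correlation of two equal-length sequences."""
+    n = len(xs)
+    mean_x, mean_y = sum(xs) / n, sum(ys) / n
+    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
+    var_x = sum((x - mean_x) ** 2 for x in xs)
+    var_y = sum((y - mean_y) ** 2 for y in ys)
+    return cov / math.sqrt(var_x * var_y)
+
+
+def alternative_score(M, names, x, y):
+    """Correlation of neighbor weights (excluding x and y) and direct edge."""
+    i, j = names.index(x), names.index(y)
+    others = [k for k in range(len(names)) if k not in (i, j)]
+    correlation = pearson([M[i][k] for k in others], [M[j][k] for k in others])
+    return correlation, M[i][j]
+
+
+names = ["seaborn", "bokeh", "numpy", "pandas", "django"]
+# Symmetric, made-up co-occurrence weights between the packages above.
+M = [
+    [0.0, 0.1, 0.8, 0.7, 0.0],  # seaborn
+    [0.1, 0.0, 0.7, 0.6, 0.1],  # bokeh
+    [0.8, 0.7, 0.0, 0.9, 0.1],  # numpy
+    [0.7, 0.6, 0.9, 0.0, 0.1],  # pandas
+    [0.0, 0.1, 0.1, 0.1, 0.0],  # django
+]
+
+correlation, direct = alternative_score(M, names, "seaborn", "bokeh")
+print(round(correlation, 3), direct)
+```
+
+In this toy matrix seaborn and bokeh have a neighbor correlation close to 1 and a direct weight of only 0.1, which is the pattern described above.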
+
+This would work in many situations, but there are some others it wouldn't
+handle well. Example case it wouldn't handle well:
+
+ * sqlalchemy is an alternative to django built-in ORM.
+ * django ORM is only used in django.
+ * django ORM is not well usable in other web frameworks like flask.
+ * other web frameworks make heavy use of flask ORM, but not django built-in ORM.
+
+Therefore, django ORM and sqlalchemy wouldn't have their neighbor weights
+correlated. I might have gotten some ORM details wrong, as I don't do much web dev.
+
+I also plan to experiment with [node2vec](http://arxiv.org/abs/1607.00653) or
+squaring the adjacency matrix.
+
+### Within-repo relationships
+
+Currently, I am only looking at imports within the same file. It could be
+interesting to look at the same graph built using "within same repository"
+relationship, or systematically compare the "within same repository" and
+"within same file" relationships.
+
+### Join with pypi
+
+It could be interesting to compare usages on github with pypi downloads. [Pypi is also accessible on BigQuery.](https://mail.python.org/pipermail/distutils-sig/2016-May/028986.html)
+
+## Data
+
+ * [Post-processed JSON data used by d3](http://clustering.kozikow.com/graph.js)
+ * [Publicly available BigQuery tables with all the data](https://bigquery.cloud.google.com/dataset/wide-silo-135723:github_clustering). See Reproduce section to see how each table was generated.
+
+## Steps to reproduce
+
+### Extract data from BigQuery
+
+#### Create a table with packages
+
+Save to wide-silo-135723:github_clustering.packages_in_file_py:
+
+```sql
+
+ SELECT
+ id,
+ NEST(UNIQUE(COALESCE(
+ REGEXP_EXTRACT(line, r"^from ([a-zA-Z0-9_-]+).*import"),
+ REGEXP_EXTRACT(line, r"^import ([a-zA-Z0-9_-]+)")))) AS package
+ FROM (
+ SELECT
+ id AS id,
+ LTRIM(SPLIT(content, "\n")) AS line,
+ FROM
+ [fh-bigquery:github_extracts.contents_py]
+ HAVING
+ line CONTAINS "import")
+ GROUP BY id
+ HAVING LENGTH(package) > 0;
+
+```
+
+The table will have two fields: an id representing the file and a repeated field with
+the packages in that file. Repeated fields are like arrays - [the best description of repeated fields I found.](http://stackoverflow.com/questions/32020714/what-does-repeated-field-in-google-bigquery-mean)
+
+This is the only step that is specific to Python.
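+
+To make the extraction logic easier to follow, here is a rough Python equivalent of the query above for a single file. It reuses the two regular expressions from the query; the function name `packages_in_file` and the sample input are hypothetical:
+
+```python
+import re
+
+# Same patterns as in the BigQuery query above.
+FROM_RE = re.compile(r"^from ([a-zA-Z0-9_-]+).*import")
+IMPORT_RE = re.compile(r"^import ([a-zA-Z0-9_-]+)")
+
+
+def packages_in_file(content):
+    """Return the set of top-level packages imported in one file."""
+    packages = set()
+    for line in content.split("\n"):
+        line = line.lstrip()  # mirrors LTRIM
+        if "import" not in line:  # mirrors the CONTAINS filter
+            continue
+        # COALESCE: try the "from x import" form first, then "import x".
+        match = FROM_RE.search(line) or IMPORT_RE.search(line)
+        if match:
+            packages.add(match.group(1))
+    return packages
+
+
+sample = (
+    "from __future__ import print_function\n"
+    "import numpy as np\n"
+    "    import os.path\n"
+    "from . import helpers\n"
+    "x = 1\n"
+)
+print(packages_in_file(sample))
+```
+
+Relative imports such as `from . import helpers` do not match either pattern, so they are skipped, just as in the query.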
+
+#### Verify the packages_in_file_py table
+
+Check that imports have been correctly parsed out from some [random file](https://github.com/sunzhxjs/JobGIS/blob/master/lib/python2.7/site-packages/pandas/core/format.py).
+
+```sql
+
+ SELECT
+ GROUP_CONCAT(package, ", ") AS packages,
+ COUNT(package) AS count
+ FROM [wide-silo-135723:github_clustering.packages_in_file_py]
+ WHERE id == "009e3877f01393ae7a4e495015c0e73b5aa48ea7"
+
+```
+
+packages | count
+---|---
+distutils, itertools, numpy, decimal, pandas, csv, warnings, `__future__`, IPython, math, locale, sys | 12
+
+#### Filter out infrequent packages
+
+```sql
+
+ SELECT
+ COUNT(DISTINCT(package))
+ FROM (SELECT
+ package,
+ count(id) AS count
+ FROM [wide-silo-135723:github_clustering.packages_in_file_py]
+ GROUP BY 1)
+ WHERE count > 200;
+
+```
+
+There are 3501 packages with more than 200 occurrences, which seems like a fine
+cut-off point. Create a filtered table, wide-silo-135723:github_clustering.packages_in_file_top_py:
+
+```sql
+
+ SELECT
+ id,
+ NEST(package) AS package
+ FROM (SELECT
+ package,
+ count(id) AS count,
+ NEST(id) AS id
+ FROM [wide-silo-135723:github_clustering.packages_in_file_py]
+ GROUP BY 1)
+ WHERE count > 200
+ GROUP BY id;
+
+```
+
+Results are in [wide-silo-135723:github_clustering.packages_in_file_top_py].
+
+```sql
+
+ SELECT
+ COUNT(DISTINCT(package))
+ FROM [wide-silo-135723:github_clustering.packages_in_file_top_py];
+
+```
+
+```python
+
+ 3501
+
+```
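+
+The same filtering step can be sketched in plain Python, assuming the packages are held in a dict from file id to a set of package names. The function name `filter_frequent` and the toy data are hypothetical:
+
+```python
+from collections import Counter
+
+
+def filter_frequent(packages_by_file, min_count=200):
+    """Keep only packages that occur in more than min_count files."""
+    counts = Counter(
+        package
+        for packages in packages_by_file.values()
+        for package in set(packages)
+    )
+    filtered = {}
+    for file_id, packages in packages_by_file.items():
+        kept = {p for p in packages if counts[p] > min_count}
+        if kept:  # files left without any package are dropped
+            filtered[file_id] = kept
+    return filtered
+
+
+# Toy data with a threshold of 1 instead of 200.
+files = {"a": {"os", "sys"}, "b": {"os"}, "c": {"rare"}}
+print(filter_frequent(files, min_count=1))
+```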
+
+#### Generate graph edges
+
+I will generate the edges and save them to the table wide-silo-135723:github_clustering.packages_in_file_edges_py.
+
+```sql
+
+ SELECT
+ p1.package AS package1,
+ p2.package AS package2,
+ COUNT(*) AS count
+ FROM (SELECT
+ id,
+ package
+ FROM FLATTEN([wide-silo-135723:github_clustering.packages_in_file_top_py], package)) AS p1
+ JOIN
+ (SELECT
+ id,
+ package
+ FROM [wide-silo-135723:github_clustering.packages_in_file_top_py]) AS p2
+ ON (p1.id == p2.id)
+ GROUP BY 1,2
+ ORDER BY count DESC;
+
+```
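+
+The self-join above counts, for every ordered pair of packages, the number of files that import both. A minimal Python sketch of the same counting, assuming a dict from file id to a set of packages (the name `count_edges` and the toy data are hypothetical):
+
+```python
+from collections import Counter
+from itertools import product
+
+
+def count_edges(packages_by_file):
+    """Count co-occurrences of every ordered package pair within a file."""
+    edges = Counter()
+    for packages in packages_by_file.values():
+        # Like the self-join on id: all ordered pairs, including (p, p).
+        for p1, p2 in product(packages, repeat=2):
+            edges[(p1, p2)] += 1
+    return edges
+
+
+files = {"a": {"os", "sys"}, "b": {"os", "re"}, "c": {"os", "sys"}}
+edges = count_edges(files)
+print(edges[("os", "sys")], edges[("os", "os")])
+```
+
+The self-pairs such as `("os", "os")` are kept on purpose: their count equals the number of files using the package, which is what the later pandas steps rely on when they build the nodes from rows where package1 equals package2.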
+
+Top 10 edges:
+
+```sql
+
+ SELECT
+ package1,
+ package2,
+ count AS count
+ FROM [wide-silo-135723:github_clustering.packages_in_file_edges_py]
+ WHERE package1 < package2
+ ORDER BY count DESC
+ LIMIT 10;
+
+```
+
+package1 | package2 | count
+---|---|---
+os | sys | 393311
+os | re | 156765
+os | time | 156320
+logging | os | 134478
+sys | time | 133396
+re | sys | 122375
+__future__ | django | 119335
+__future__ | os | 109319
+os | subprocess | 106862
+datetime | django | 94111
+
+#### Filter out irrelevant edges
+
+Quantiles of the edge weight:
+
+```sql
+
+ SELECT
+ GROUP_CONCAT(STRING(QUANTILES(count, 11)), ", ")
+ FROM [wide-silo-135723:github_clustering.packages_in_file_edges_py];
+
+```
+
+```python
+
+ 1, 1, 1, 2, 3, 4, 7, 12, 24, 70, 1005020
+
+```
+
+In my first implementation I filtered edges out based on the total count. It
+was not a good approach, as a weak relationship between two big packages was
+more likely to stay than a strong relationship between two small packages.
+
+Create wide-silo-135723:github_clustering.packages_in_file_nodes_py:
+
+```sql
+
+ SELECT
+ package AS package,
+ COUNT(id) AS count
+ FROM [github_clustering.packages_in_file_top_py]
+ GROUP BY 1;
+
+```
+
+package | count
+---|---
+os | 1005020
+sys | 784379
+django | 618941
+__future__ | 445335
+time | 359073
+re | 349309
+
+Create the table packages_in_file_edges_top_py:
+
+```sql
+
+ SELECT
+ edges.package1 AS package1,
+ edges.package2 AS package2,
+ # Normalize by the count of the less frequent of the two packages
+ edges.count / IF(nodes1.count < nodes2.count,
+ nodes1.count,
+ nodes2.count) AS strength,
+ edges.count AS count
+ FROM [wide-silo-135723:github_clustering.packages_in_file_edges_py] AS edges
+ JOIN [wide-silo-135723:github_clustering.packages_in_file_nodes_py] AS nodes1
+ ON edges.package1 == nodes1.package
+ JOIN [wide-silo-135723:github_clustering.packages_in_file_nodes_py] AS nodes2
+ ON edges.package2 == nodes2.package
+ HAVING strength > 0.33
+ AND package1 <= package2;
+
+```
+
+[Full results in google docs.](https://docs.google.com/spreadsheets/d/1hbQAIyDUigIsEajcpNOXbmldgfLmEqsOE729SPTVpmA/edit?usp=sharing)
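+
+The strength used above is the edge count divided by the count of the less frequent package. A small sketch of the same rule in Python, checked against the os/sys numbers from the tables above; the function names are hypothetical and the `("re", "time")` count is made up to show an edge being dropped:
+
+```python
+def edge_strength(edge_count, count1, count2):
+    """Edge count normalized by the less frequent of the two packages."""
+    return edge_count / min(count1, count2)
+
+
+def filter_edges(edges, node_counts, threshold=0.33):
+    """Keep each unordered pair once if its strength exceeds threshold."""
+    kept = []
+    for (p1, p2), count in edges.items():
+        if p1 > p2:  # mirrors "package1 <= package2"
+            continue
+        strength = edge_strength(count, node_counts[p1], node_counts[p2])
+        if strength > threshold:
+            kept.append((p1, p2, strength, count))
+    return kept
+
+
+node_counts = {"os": 1005020, "sys": 784379, "re": 349309, "time": 359073}
+edges = {
+    ("os", "sys"): 393311,
+    ("sys", "os"): 393311,
+    ("re", "time"): 1000,  # made-up weak edge
+}
+print(filter_edges(edges, node_counts))
+```
+
+For os and sys this gives 393311 / 784379, about 0.50, which matches the strength shown for that edge in the pandas output further down.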
+
+### Process data with Pandas to json
+
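+#### Load csv and verify edges with pandas
+
+    import pandas as pd
+    import math
+
+    df = pd.read_csv("edges.csv")
+    # Keep only the edges that touch pandas; copy so the assignments below are safe
+    pd_df = df[(df.package1 == "pandas") | (df.package2 == "pandas")].copy()
+    # other_package is whichever end of the edge is not pandas
+    pd_df.loc[pd_df.package1 == "pandas", "other_package"] = pd_df[pd_df.package1 == "pandas"].package2
+    pd_df.loc[pd_df.package2 == "pandas", "other_package"] = pd_df[pd_df.package2 == "pandas"].package1
+
+    # df_to_org is a helper (not shown in this post) that prints a DataFrame as a table
+    df_to_org(pd_df.loc[:, ["other_package", "count"]])
+
+    print("\n", len(pd_df), "total edges with pandas")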
+
+other_package | count
+---|---
+pandas | 33846
+numpy | 21813
+statsmodels | 1355
+seaborn | 1164
+zipline | 684
+11 more rows |
+
+16 total edges with pandas
+
+#### DataFrame with nodes
+
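+    # Self-edges (package1 == package2) carry the total usage count of each package
+    nodes_df = df[df.package1 == df.package2].reset_index().loc[:, ["package1", "count"]].copy()
+    nodes_df["label"] = nodes_df.package1
+    nodes_df["id"] = nodes_df.index
+    # Radius grows with the square root of usage relative to the rarest package
+    nodes_df["r"] = (nodes_df["count"] / nodes_df["count"].min()).apply(math.sqrt) + 5
+    nodes_df["count"].apply(lambda s: str(s) + " total usages\n")  # result is not stored
+    df_to_org(nodes_df)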
+
+package1 | count | label | id | r
+---|---|---|---|---
+os | 1005020 | os | 0 | 75.711381704
+sys | 784379 | sys | 1 | 67.4690570169
+django | 618941 | django | 2 | 60.4915169887
+__future__ | 445335 | __future__ | 3 | 52.0701286903
+time | 359073 | time | 4 | 47.2662138808
+3460 more rows | | | |
+
+#### Create map of node name -> id
+
+    id_map = nodes_df.reset_index().set_index("package1").to_dict()["index"]
+
+    print(pd.Series(id_map).sort_values()[:5])
+
+```python
+
+ os 0
+ sys 1
+ django 2
+ __future__ 3
+ time 4
+ dtype: int64
+
+```
+
+#### Create edges data frame
+
+    edges_df = df.copy()
+    # Translate package names to numeric node ids
+    edges_df["source"] = edges_df.package1.apply(lambda p: id_map[p])
+    edges_df["target"] = edges_df.package2.apply(lambda p: id_map[p])
+    # Attach the node counts of both endpoints
+    edges_df = edges_df.merge(nodes_df[["id", "count"]], left_on="source", right_on="id", how="left")
+    edges_df = edges_df.merge(nodes_df[["id", "count"]], left_on="target", right_on="id", how="left")
+    df_to_org(edges_df)
+
+    print("\ndf and edges_df should be the same length: ", len(df), len(edges_df))
+
+package1 | package2 | strength | count_x | source | target | id_x | count_y |id_y | count
+---|---|---|---|---|---|---|---|---|---
+os | os | 1.0 | 1005020 | 0 | 0 | 0 | 1005020 | 0 | 1005020
+sys | sys | 1.0 | 784379 | 1 | 1 | 1 | 784379 | 1 | 784379
+django | django | 1.0 | 618941 | 2 | 2 | 2 | 618941 | 2 | 618941
+__future__ | __future__ | 1.0 | 445335 | 3 | 3 | 3 | 445335 | 3 | 445335
+os | sys | 0.501429793505 | 393311 | 0 | 1 | 0 | 1005020 | 1 | 784379
+11117 more rows | | | | | | | | |
+
+df and edges_df should be the same length: 11122 11122
+
+#### Add reversed edge
+
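+    edges_rev_df = edges_df.copy()
+    # Swap source and target to get the reversed edges
+    edges_rev_df.loc[:, ["source", "target"]] = edges_rev_df.loc[:, ["target", "source"]].values
+    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
+    edges_df = pd.concat([edges_df, edges_rev_df])
+    df_to_org(edges_df)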
+
+package1 | package2 | strength | count_x | source | target | id_x | count_y |id_y | count
+---|---|---|---|---|---|---|---|---|---
+os | os | 1.0 | 1005020 | 0 | 0 | 0 | 1005020 | 0 | 1005020
+sys | sys | 1.0 | 784379 | 1 | 1 | 1 | 784379 | 1 | 784379
+django | django | 1.0 | 618941 | 2 | 2 | 2 | 618941 | 2 | 618941
+__future__ | __future__ | 1.0 | 445335 | 3 | 3 | 3 | 445335 | 3 | 445335
+os | sys | 0.501429793505 | 393311 | 0 | 1 | 0 | 1005020 | 1 | 784379
+22239 more rows | | | | | | | | |
+
+#### Truncate edges DataFrame
+
+    edges_df = edges_df[["source", "target", "strength"]]
+    df_to_org(edges_df)
+
+source | target | strength
+---|---|---
+0.0 | 0.0 | 1.0
+1.0 | 1.0 | 1.0
+2.0 | 2.0 | 1.0
+3.0 | 3.0 | 1.0
+0.0 | 1.0 | 0.501429793505
+22239 more rows | |
+
+#### After running simulation in the browser, get saved positions
+
+The whole simulation takes a minute to stabilize. I could just download an
+image, but the live page has extra features, such as opening a package's pypi page when you click its node.
+
+Download all positions after the simulation from the javascript console:
+
+```javascript
+
+ var positions = nodes.map(function bar (n) { return [n.id, n.x, n.y]; });
+ JSON.stringify(positions);
+
+```
+
+Join the x and y positions with the nodes dataframe, so they will get picked up by
+d3.
+
+    pos_df = pd.read_json("fixed-positions.json")
+    pos_df.columns = ["id", "x", "y"]
+    nodes_df = nodes_df.merge(pos_df, on="id")
+
+#### Truncate nodes DataFrame
+
+    # c will be the collision radius. It prevents labels from overlapping.
+    nodes_df["c"] = pd.DataFrame([nodes_df.label.str.len() * 1.8, nodes_df.r]).max() + 5
+    nodes_df = nodes_df[["id", "r", "label", "c", "x", "y"]]
+    df_to_org(nodes_df)
+
+id | r | label | c | x | y
+---|---|---|---|---|---
+0 | 75.711381704 | os | 80.711381704 | 158.70817237 | 396.074393369
+1 | 67.4690570169 | sys | 72.4690570169 | 362.371142521 | -292.138913114
+2 | 60.4915169887 | django | 65.4915169887 | 526.471326062 | 1607.83507287
+3 | 52.0701286903 | __future__ | 57.0701286903 | 1354.91212894 | 680.325432179
+4 | 47.2662138808 | time | 52.2662138808 | 419.407448663 | 439.872927665
+3460 more rows | | | | |
+
+#### Save files to json
+
+    # Write nodes, the name -> id map and links as JavaScript variables for d3
+    with open("graph.js", "w") as f:
+        f.write("var nodes = {}\n\n".format(nodes_df.to_dict(orient="records")))
+        f.write("var nodeIds = {}\n".format(id_map))
+        f.write("var links = {}\n\n".format(edges_df.to_dict(orient="records")))
+
+### Draw the graph using the new d3 velocity verlet integration
+
+The physical simulation uses the new [velocity verlet integration force graph in d3 v 4.0.](https://github.com/d3/d3/blob/master/API.md#forces-d3-force) The simulation
+takes about one minute to stabilize, so for viewing purposes I hard-coded the
+positions of the nodes after running the simulation on my machine.
+
+The core component of the simulation is:
+
+```javascript
+
+ var simulation = d3.forceSimulation(nodes)
+ .force("charge", d3.forceManyBody().strength(-400))
+ .force("link", d3.forceLink(links).distance(30).strength(function (d) {
+ return d.strength * d.strength;
+ }))
+ .force("collide", d3.forceCollide().radius(function(d) {
+ return d.c;
+ }).strength(5))
+ .force("x", d3.forceX().strength(0.1))
+ .force("y", d3.forceY().strength(0.1))
+ .on("tick", ticked);
+
+```
+
+To re-run the simulation you can:
+
+ * Remove fixed positions added in one of pandas processing steps.
+ * Uncomment the "forces" in the [javascript file.](https://github.com/kozikow/kozikow-blog/blob/master/clustering/index2.js#L2)
+
+#### Simulation parameters
+
+I have been tweaking the simulation parameters for a while. The very dense "center" of
+the graph is in conflict with the clusters on the edge of the graph.
+
+As you may see in the current graph, nodes in the center sometimes overlap,
+while the distance between nodes on the edge of the graph is big.
+
+I got as much as I could from the collision parameter, and increasing it
+further wasn't helpful. Potentially I could increase gravity towards the
+center, but then some of the valuable "clusters" from the edges of the graph get
+lumped into the big "kernel" in the center.
+
+Plotting some big clusters separately worked well to solve this problem.
+
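+ * Attractive forces
+
+ * Link between packages A and B: strength is the squared edge strength (`d.strength * d.strength` in the code above), distance 30
+ * Gravity towards the center: 0.1
+
+ * Repulsive forces
+
+ * Repulsion between nodes: -400
+ * Collision strength of nodes: 5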
diff --git "a/\344\275\240\350\257\245\347\233\270\344\277\241\350\260\201\347\232\204\350\257\204\345\210\206\357\274\237IMDB\357\274\214\347\203\202\347\225\252\350\214\204\357\274\214Metacritic\357\274\214\350\277\230\346\230\257Fandango\357\274\237.md" "b/\344\275\240\350\257\245\347\233\270\344\277\241\350\260\201\347\232\204\350\257\204\345\210\206\357\274\237IMDB\357\274\214\347\203\202\347\225\252\350\214\204\357\274\214Metacritic\357\274\214\350\277\230\346\230\257Fandango\357\274\237.md"
new file mode 100644
index 0000000..d761113
--- /dev/null
+++ "b/\344\275\240\350\257\245\347\233\270\344\277\241\350\260\201\347\232\204\350\257\204\345\210\206\357\274\237IMDB\357\274\214\347\203\202\347\225\252\350\214\204\357\274\214Metacritic\357\274\214\350\277\230\346\230\257Fandango\357\274\237.md"
@@ -0,0 +1,3 @@
+Original: [Whose ratings should you trust? IMDB, Rotten Tomatoes, Metacritic, or Fandango?](https://medium.freecodecamp.com/whose-reviews-should-you-trust-imdb-rotten-tomatoes-metacritic-or-fandango-7d1010c6cf19)
+
+---
\ No newline at end of file
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/README.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/README.md"
new file mode 100644
index 0000000..beff5db
--- /dev/null
+++ "b/\350\257\273\344\271\246\347\254\224\350\256\260/README.md"
@@ -0,0 +1,2 @@
+* [Official Python tutorial (Python 3)](https://docs.python.org/zh-cn/3/tutorial/index.html) | [Mind map](http://naotu.baidu.com/file/a50273db8bba3a2c7f257f039d06e7c1?token=f389efed04f2e4b8)
+*
\ No newline at end of file
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/1. Python \346\225\260\346\215\256\346\250\241\345\236\213.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/1. Python \346\225\260\346\215\256\346\250\241\345\236\213.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/10.\345\272\217\345\210\227\347\232\204\344\277\256\346\224\271\343\200\201\346\225\243\345\210\227\345\222\214\345\210\207\347\211\207.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/10.\345\272\217\345\210\227\347\232\204\344\277\256\346\224\271\343\200\201\346\225\243\345\210\227\345\222\214\345\210\207\347\211\207.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/11.\346\216\245\345\217\243\357\274\232\344\273\216\345\215\217\350\256\256\345\210\260\346\212\275\350\261\241\345\237\272\347\261\273.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/11.\346\216\245\345\217\243\357\274\232\344\273\216\345\215\217\350\256\256\345\210\260\346\212\275\350\261\241\345\237\272\347\261\273.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/12.\347\273\247\346\211\277\347\232\204\344\274\230\347\274\272\347\202\271.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/12.\347\273\247\346\211\277\347\232\204\344\274\230\347\274\272\347\202\271.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/13.\346\255\243\347\241\256\351\207\215\350\275\275\350\277\220\347\256\227\347\254\246.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/13.\346\255\243\347\241\256\351\207\215\350\275\275\350\277\220\347\256\227\347\254\246.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/14. \345\217\257\345\217\240\346\210\264\347\232\204\345\257\271\350\261\241\343\200\201\350\277\255\344\273\243\345\231\250\345\222\214\347\224\237\346\210\220\345\231\250.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/14. \345\217\257\345\217\240\346\210\264\347\232\204\345\257\271\350\261\241\343\200\201\350\277\255\344\273\243\345\231\250\345\222\214\347\224\237\346\210\220\345\231\250.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/15. \344\270\212\344\270\213\346\226\207\347\256\241\347\220\206\345\231\250\345\222\214 else \345\235\227.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/15. \344\270\212\344\270\213\346\226\207\347\256\241\347\220\206\345\231\250\345\222\214 else \345\235\227.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/16. \345\215\217\347\250\213.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/16. \345\215\217\347\250\213.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/17. \344\275\277\347\224\250\346\234\237\347\211\251\345\244\204\347\220\206\345\271\266\345\217\221.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/17. \344\275\277\347\224\250\346\234\237\347\211\251\345\244\204\347\220\206\345\271\266\345\217\221.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/18. \344\275\277\347\224\250 asyncio \345\214\205\345\244\204\347\220\206\345\271\266\345\217\221.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/18. \344\275\277\347\224\250 asyncio \345\214\205\345\244\204\347\220\206\345\271\266\345\217\221.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/19. \345\212\250\346\200\201\345\261\236\346\200\247\345\222\214\347\211\271\346\200\247.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/19. \345\212\250\346\200\201\345\261\236\346\200\247\345\222\214\347\211\271\346\200\247.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/2. \345\272\217\345\210\227\346\236\204\346\210\220\347\232\204\346\225\260\347\273\204.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/2. \345\272\217\345\210\227\346\236\204\346\210\220\347\232\204\346\225\260\347\273\204.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/20. \345\261\236\346\200\247\346\217\217\350\277\260\347\254\246.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/20. \345\261\236\346\200\247\346\217\217\350\277\260\347\254\246.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/21. \347\261\273\345\205\203\347\274\226\347\250\213.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/21. \347\261\273\345\205\203\347\274\226\347\250\213.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/3. \345\255\227\345\205\270\345\222\214\351\233\206\345\220\210.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/3. \345\255\227\345\205\270\345\222\214\351\233\206\345\220\210.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/4. \346\226\207\346\234\254\345\222\214\345\255\227\350\212\202\345\272\217\345\210\227.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/4. \346\226\207\346\234\254\345\222\214\345\255\227\350\212\202\345\272\217\345\210\227.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/5.\344\270\200\347\255\211\345\207\275\346\225\260.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/5.\344\270\200\347\255\211\345\207\275\346\225\260.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/6.\344\275\277\347\224\250\344\270\200\347\255\211\345\207\275\346\225\260\345\256\236\347\216\260\350\256\276\350\256\241\346\250\241\345\274\217.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/6.\344\275\277\347\224\250\344\270\200\347\255\211\345\207\275\346\225\260\345\256\236\347\216\260\350\256\276\350\256\241\346\250\241\345\274\217.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/7.\345\207\275\346\225\260\350\243\205\351\245\260\345\231\250\345\222\214\351\227\255\345\214\205.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/7.\345\207\275\346\225\260\350\243\205\351\245\260\345\231\250\345\222\214\351\227\255\345\214\205.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/8.\345\257\271\350\261\241\345\274\225\347\224\250\343\200\201\345\217\257\345\217\230\346\200\247\345\222\214\345\236\203\345\234\276\345\233\236\346\224\266.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/8.\345\257\271\350\261\241\345\274\225\347\224\250\343\200\201\345\217\257\345\217\230\346\200\247\345\222\214\345\236\203\345\234\276\345\233\236\346\224\266.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/9.\347\254\246\345\220\210 python \351\243\216\346\240\274\347\232\204\345\257\271\350\261\241.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/9.\347\254\246\345\220\210 python \351\243\216\346\240\274\347\232\204\345\257\271\350\261\241.md"
new file mode 100644
index 0000000..e69de29
diff --git "a/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/README.md" "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/README.md"
new file mode 100644
index 0000000..80a56e4
--- /dev/null
+++ "b/\350\257\273\344\271\246\347\254\224\350\256\260/\346\265\201\347\225\205\347\232\204 python/README.md"
@@ -0,0 +1,9 @@
+# Fluent Python
+
+### Source code for the book
+Location: [fluentpython/example-code](https://github.com/fluentpython/example-code)
+
+Usage:
+```
+$ python3 -m doctest example_script.py
+```
\ No newline at end of file