{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Introduction to Python for Scientific Computing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Quentin CAUDRON, Princeton University" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Chapter 1. Introduction and Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python is a *general-purpose*, object-oriented programming language. It is dynamically-typed and interpreted. It has a thriving community of developers, especially in science. This means that you never need to reinvent the wheel - just about anything you may want to do has been ( or is being ) implemented, in a robust, computationally-efficient, and user-friendly way.\n", "\n", "Python emphasises *legibility* and *clarity* in its coding. It offers a very large standard library, and the typical \"scientific Python stack\", which includes Numpy, Scipy, and Matplotlib, allows you to go very far." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The Zen of Python**\n", "\n", "- Beautiful is better than ugly.\n", "- Explicit is better than implicit.\n", "- Simple is better than complex.\n", "- Complex is better than complicated.\n", "- Readability counts.\n", "\n", "Python is a highly modular language, allowing you to easily pull in *modules* and *packages*, and to write your own." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Compared to Other Languages" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**C, C++, and Fortran**\n", "\n", "These languages are very fast, and great for heavy computations. However, they're slow and painful to write - there's no interactivity, the syntax gets complicated, and memory management is manual." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**R**\n", "\n", "A tool for advanced statistics, but the language is exactly that : a *tool* aimed at stats. It's not very good for general-purpose coding. I have a strange aversion to it. Hey, at least it's free and open-source." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Matlab and Octave**\n", "\n", "Matlab has a great development environment, and a huge number of optimised, implemented toolsets. It's very expensive though. Octave is a great free clone, but it's not as pleasant to use." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**So, Python ?**\n", "\n", "Huge range of scientific tools - nonlinear function fitting, MCMC, spectral analysis, ODEs and PDEs, signal and image processing, great data science tools. Vast community, active development, and much higher quality than R due to the way the language is developed and the way we code it. It's **batteries-included**. The downside is that the IDE isn't as shiny as Matlab's, but I'm well over it - IPython Notebooks ( we'll see later ) are win." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "The Python Scientific Stack" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python's standard library is huge. Still, as scientists, we require some fairly specific things that pure programmers might not immediately need : reasonable vector notation and manipulation, matrix and linear algebra, optimisation, interpolation, random numbers and statistical functions, plotting, etc. 
The \"standard Python stack\" puts together a few modules and extensions to Python to give us these.\n", "\n", "- **Numpy** : arrays, matrices and their operations, random numbers, logical operations, ...\n", "- **Scipy** : linear algebra, integration, signal tools, optimisation, ...\n", "- **Matplotlib** : plotting\n", "- **IPython** : interactivity, Notebooks ( like this one ), ...\n", "\n", "Then, you may need more specific tools, like a good MCMC sampler ( PyMC ), or constrained, nonlinear function fitting ( lmfit ), or machine learning algorithms ( scikit-learn ), image processing ( scikit-image ), etc. The list goes on !\n", "\n", "In this tutorial, we'll get set up with the basic Python stack and explore how its components work. Then we'll demo some of these other packages for fun." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Recommended Installs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Distribution**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I find that **Anaconda** is a great distribution. It comes with most of the packages you'll need, and a great command-line package manager to help keep them up to date and install others. If you want to grab the faster distro ( free for academics ), head to **`store.continuum.io/cshop/academicanaconda`** and register with an @blah**.edu** address to grab an academic license. If you can't be bothered with that, it's at **`continuum.com/downloads`**.\n", "\n", "1. Linux : just run the .sh\n", "2. OSX : just run the .dmg. Might need to select \"install for me only\" if you don't have admin rights.\n", "3. Windows : just run the .exe\n", "\n", "If you don't want Anaconda, you can install Python on its own, and then add packages and modules as you need them. I won't be covering this in the interest of time. If you're under Linux, use your package manager; if you're under OSX, then HomeBrew has what you need. 
If you're running Windows and are interested in installing everything yourself, cry a little and then head to **`python.org/getit`**.\n", "\n", "A note on what we're installing : Anaconda comes with Python 2.7. This is by far the most widespread version of Python. It goes by Python 2 for short. There has been, for years, a Python 3, but it isn't backwards compatible, and whilst many of the main scientific packages are working to fix that, the vast majority of scientists use Python 2 ( actually, just about everyone : the Python dudes themselves recommend starting with 2.7, due to compatibility; this is changing however, as more packages move towards Py3 compatibility )." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Front-End / Interface**\n", "\n", "Here, we have several options. With Python now installed, you just need an environment to write it in. You could use a standard text editor :\n", "\n", "1. Linux : you know what you're doing - emacs and vim work, or something graphical like Eclipse\n", "2. OSX : Sublime if you'd like a recommendation for a text editor\n", "3. Windows : Uhhh... Notepad++ ?\n", "\n", "Then, you could go with something more advanced, like an IDE ( integrated development environment ). An IDE is a one-stop shop to write and run your code. For Python, I like Spyder. It's available for all three of the above OSs. If you've installed Anaconda, you already have Spyder. Linux and OSX users, call `spyder` from the command line. Windows users can call the Anaconda Launcher, and you'll have it there.\n", "\n", "Finally, there's my favourite way to write Python *for development* : the IPython Notebook. If you installed Anaconda, you already have IPython. Otherwise, go get it, you won't regret it. The IPython Notebook concept will be familiar to you if you've used Mathematica, and some aspects of Matlab ( though in Matlab, it's not done so well ). You have *cells* in which you write code, and you can execute cells independently. 
With a quick command-line hack, you can even get plots inline, such that all plots show up under the relevant cells. To call IPython Notebooks, OSX and Linux users can just call `ipython notebook` from the command line, and Windows users with Anaconda have an IPython Notebook shortcut in their Start Menu ( in theory ). For inline plots and sexiness all around, I prefer calling `ipython notebook --pylab inline --script`. This adds two options : `--pylab inline` tells the plotting to happen under each cell rather than in separate windows, and `--script` tells IPython to save a `.py` file alongside the `.ipynb`, so you can just run your code from the command line or on a remote computer if you want to. Linux and OSX users can write this as an alias if they want : drop the line \n", "\n", "`alias pynb='ipython notebook --pylab inline --script'` \n", "\n", "in `~/.bashrc`, and Windows users can edit the shortcut to their IPython Notebook to get the same result." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "OK, so..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In theory, we have now correctly installed a working Python setup, and have some useful way of writing Python code. Next chapter : writing and running basic Python. Just a quick note first : how do you actually *run* the code ?\n", "\n", "0. Just call `python` from the command line. Not generally recommended - `ipython` adds useful stuff like tab-completion, for instance.\n", "\n", "1. Text editors : you save the file and call Python from the command line, giving it your filename as argument.\n", "\n", "2. IDE ( like Spyder ) : there are two boxes in which you can type. One is like a text editor, and allows you to write down the code that works when you find it. The other is interactive, and will run the code immediately. \n", "\n", "3. IPython Notebooks : Shift + Enter will run the cell you're in and move on to the next one. 
CTRL + Enter will run the current cell and *not* move on to the next one. Then there are the menus at the top, where you can run all cells and such.\n" ] } ], "metadata": {} } ] }