Hands-On Financial Trading with Python
A practical guide to using Zipline and other Python libraries for backtesting trading strategies
Jiri Pik and Sourav Ghosh
Table of Contents
Preface
Section 1: Introduction to Algorithmic Trading
1
Introduction to Algorithmic Trading
Walking through the evolution of algorithmic trading 4
Understanding financial asset classes 6
Going through the modern electronic trading exchange 7
Order types 7
Limit order books 8
The exchange matching engine 9
Understanding the components of an algorithmic trading system 9
The core infrastructure of an algorithmic trading system 10
The quantitative infrastructure of an algorithmic trading system 11
Summary 16
Section 2: In-Depth Look at Python Libraries for the Analysis of Financial Datasets
2
Exploratory Data Analysis in Python
Technical requirements 19
Introduction to EDA 20
Steps in EDA 20
Revelation of the identity of A, B, and C and EDA’s conclusions 41
Special Python libraries for EDA 42
Summary 44
3
High-Speed Scientific Computing Using NumPy
Technical requirements 46
Introduction to NumPy 46
Creating NumPy ndarrays 46
Creating 1D ndarrays 46
Creating 2D ndarrays 47
Creating any-dimension ndarrays 47
Creating an ndarray with np.zeros(…) 48
Creating an ndarray with np.ones(…) 48
Creating an ndarray with np.identity(…) 49
Creating an ndarray with np.arange(…) 49
Creating an ndarray with np.random.randn(…) 49
Data types used with NumPy ndarrays 50
Creating a numpy.float64 array 50
Creating a numpy.bool array 50
ndarrays’ dtype attribute 51
Converting underlying data types of ndarray with numpy.ndarray.astype(…) 51
Indexing of ndarrays 51
Direct access to an ndarray’s element 52
ndarray slicing 53
Boolean indexing 56
Indexing with arrays 58
Basic ndarray operations 59
Scalar multiplication with an ndarray 59
Linear combinations of ndarrays 59
Exponentiation of ndarrays 59
Addition of an ndarray with a scalar 60
Transposing a matrix 60
Changing the layout of an ndarray 60
Finding the minimum value in an ndarray 61
Calculating the absolute value 61
Calculating the mean of an ndarray 62
Finding the index of the maximum value in an ndarray 62
Calculating the cumulative sum of elements of an ndarray 63
Finding NaNs in an ndarray 63
Finding the truth values of x1>x2 of two ndarrays 64
any and all Boolean operations on ndarrays 65
Sorting ndarrays 66
Searching within ndarrays 68
File operations on ndarrays 69
File operations with text files 69
File operations with binary files 70
Summary 71
4
Data Manipulation and Analysis with pandas
Introducing pandas Series, pandas DataFrames, and pandas Indexes 74
pandas.Series 74
pandas.DataFrame 76
pandas.Index 79
Learning essential pandas.DataFrame operations 80
Indexing, selection, and filtering of DataFrames 80
Dropping rows and columns from a DataFrame 82
Sorting values and ranking the values’ order within a DataFrame 84
Arithmetic operations on DataFrames 86
Merging and combining multiple DataFrames into a single DataFrame 88
Hierarchical indexing 91
Grouping operations in DataFrames 94
Transforming values in DataFrames’ axis indices 97
Handling missing data in DataFrames 98
The transformation of DataFrames with functions and mappings 101
Discretization/bucketing of DataFrame values 102
Permuting and sampling DataFrame values to generate new DataFrames 104
Exploring file operations with pandas.DataFrames 106
CSV files 106
JSON files 108
Summary 109
5
Data Visualization Using Matplotlib
Technical requirements 112
Creating figures and subplots 112
Defining figures’ subplots 112
Plotting in subplots 113
Enriching plots with colors, markers, and line styles 116
Enriching axes with ticks, labels, and legends 118
Enriching data points with annotations 120
Saving plots to files 123
Charting a pandas DataFrame with Matplotlib 124
Creating line plots of a DataFrame column 125
Creating bar plots of a DataFrame column 126
Creating histogram and density plots of a DataFrame column 128
Creating scatter plots of two DataFrame columns 130
Plotting time series data 133
Summary 144
6
Statistical Estimation, Inference, and Prediction
Technical requirements 146
Introduction to statsmodels 146
Normal distribution test with Q-Q plots 146
Time series modeling with statsmodels 148
ETS analysis of a time series 149
Augmented Dickey-Fuller test for stationarity of a time series 157
Autocorrelation and partial autocorrelation of a time series 159
ARIMA time series model 161
Using a SARIMAX time series model with pmdarima 166
Time series forecasting with Facebook’s Prophet library 171
Introduction to scikit-learn regression and classification 174
Generating the dataset 174
Running RidgeCV regression on the dataset 178
Running a classification method on the dataset 182
Summary 186
Section 3: Algorithmic Trading in Python
7
Financial Market Data Access in Python
Exploring the yahoofinancials Python library 190
Single-ticker retrieval 191
Multiple-tickers retrieval 198
Exploring the pandas_datareader Python library 201
Access to Yahoo Finance 202
Access to EconDB 203
Access to the Federal Reserve Bank of St. Louis’ FRED 204
Caching queries 205
Exploring the Quandl data source 206
Exploring the IEX Cloud data source 207
Exploring the MarketStack data source 209
Summary 211
8
Introduction to Zipline and PyFolio
Introduction to Zipline and PyFolio 214
Installing Zipline and PyFolio 214
Installing Zipline 214
Installing PyFolio 215
Importing market data into a Zipline/PyFolio backtesting system 215
Importing data from the historical Quandl bundle 215
Importing data from the CSV files bundle 218
Importing data from custom bundles 219
Structuring Zipline/PyFolio backtesting modules 229
Trading happens every day 230
Trading happens on a custom schedule 231
Reviewing the key Zipline API reference 233
Types of orders 233
Commission models 234
Slippage models 234
Running Zipline backtesting from the command line 235
Introduction to risk management with PyFolio 236
Market volatility, PnL variance, and PnL standard deviation 239
Trade-level Sharpe ratio 240
Maximum drawdown 242
Summary 244
9
Fundamental Algorithmic Trading Strategies
What is an algorithmic trading strategy? 246
Learning momentum-based/trend-following strategies 248
Rolling window mean strategy 248
Simple moving averages strategy 254
Exponentially weighted moving averages strategy 259
RSI strategy 265
MACD crossover strategy 270
RSI and MACD strategies 276
Triple exponential average strategy 282
Williams %R strategy 287
Learning mean-reversion strategies 292
Bollinger band strategy 292
Pairs trading strategy 298
Learning mathematical model-based strategies 305
Minimization of the portfolio volatility strategy with monthly trading 305
Maximum Sharpe ratio strategy with monthly trading 312
Learning time series prediction-based strategies 317
SARIMAX strategy 318
Prophet strategy 323
Summary 328
Appendix A
How to Set Up a Python Environment
Technical requirements 329
Initial setup 329
Downloading the complimentary Quandl data bundle 332
Other Books You May Enjoy
Index