Stock Market Prediction

ZiaAhuja,Canada,Professional
Published Date:17-07-2017
Student Name: Mark Dunne
Student ID: 111379601
Supervisor: Derek Bridge
Second Reader: Gregory Provan

Declaration of Originality

In signing this declaration, you are confirming, in writing, that the submitted work is entirely your own original work, except where clearly attributed otherwise, and that it has not been submitted partly or wholly for any other educational award. I hereby declare that:

• This is all my own work, unless clearly indicated otherwise, with full and proper accreditation;
• With respect to my own work: none of it has been submitted at any educational institution contributing in any way towards an educational award;
• With respect to another's work: all text, diagrams, code, or ideas, whether verbatim, paraphrased or otherwise modified or adapted, have been duly attributed to the source in a scholarly manner, whether from books, papers, lecture notes or any other student's work, whether published or unpublished, electronically or in print.

Name: Mark Dunne
Signed:
Date:

Abstract

In this report we analyse existing and new methods of stock market prediction. We take three different approaches to the problem: Fundamental Analysis, Technical Analysis, and the application of Machine Learning. We find evidence in support of the weak form of the Efficient Market Hypothesis, that the historic price does not contain useful information but out of sample data may be predictive. We show that Fundamental Analysis and Machine Learning could be used to guide an investor's decisions. We demonstrate a common flaw in Technical Analysis methodology and show that it produces limited useful information. Based on our findings, algorithmic trading programs are developed and simulated using Quantopian.

Contents

1 Introduction
  1.1 Project Goals and Scope
2 Considerations in Approaching the Problem
  2.1 Random Walk Hypothesis
    2.1.1 Qualitative Similarity to Random Pattern
    2.1.2 Quantitative Difference to Random Pattern
  2.2 Efficient Market Hypothesis
  2.3 Self Defeating Strategies
  2.4 Conclusions
3 Review of Existing Work
  3.1 Article 1 - Kara et al. [10]
  3.2 Article 2 - Shen et al. [19]
4 Data and Tools
  4.1 Data Used
    4.1.1 Choosing the Dataset
    4.1.2 Gathering the Datasets
    4.1.3 Limitations of the Data
  4.2 Tools
5 Attacking the Problem - Fundamental Analysis
  5.1 Price to Earnings Ratio
  5.2 Price to Book Ratio
  5.3 Limitations of Fundamental Analysis
  5.4 Fundamental Analysis - Conclusion
6 Attacking the Problem - Technical Analysis
  6.1 Broad Families of Technical Analysis Models
  6.2 Naive Trading Patterns
  6.3 Moving Average Crossover
    6.3.1 Evaluating the Moving Average Crossover Model
  6.4 Additional Technical Analysis Models
    6.4.1 Evaluating the Indicators
    6.4.2 Data Preparation
    6.4.3 Error Estimation
  6.5 Common Problems with Technical Analysis
  6.6 Technical Analysis - Conclusion
7 Attacking the Problem - Machine Learning
  7.1 Preceding 5 Day Prices
    7.1.1 Error Estimation
    7.1.2 Analysis of Model Failure
    7.1.3 Preceding 5 Day Prices - Conclusion
  7.2 Related Assets
    7.2.1 Data
    7.2.2 Exploration of Feature Utility
    7.2.3 Modeling
    7.2.4 Related Assets - Conclusion
  7.3 Analyst Opinions
    7.3.1 Data
    7.3.2 Data Exploration
    7.3.3 Data Preparation
    7.3.4 Error Estimation
    7.3.5 Model Selection
    7.3.6 Analyst Opinions - Conclusion
  7.4 Disasters
    7.4.1 Data Preparation
    7.4.2 Predictive Value of Disasters
    7.4.3 Disasters - Conclusion
8 Quantopian Trading Simulation
  8.1 Simulation 1 - Related Assets
  8.2 Simulation 2 - Analyst Opinions
9 Report Conclusion

Chapter 1
Introduction

Predicting the Stock Market has been the bane and goal of investors since its existence.
Every day billions of dollars are traded on the exchange, and behind each dollar is an investor hoping to profit in one way or another. Entire companies rise and fall daily based on the behaviour of the market. Should an investor be able to accurately predict market movements, that ability offers a tantalizing promise of wealth and influence. It is no wonder then that the Stock Market and its associated challenges find their way into the public imagination every time it misbehaves. The 2008 financial crisis was no different, as evidenced by the flood of films and documentaries based on the crash. If there was a common theme among those productions, it was that few people knew how the market worked or reacted. Perhaps a better understanding of stock market prediction might help in the case of similar events in the future.

1.1 Project Goals and Scope

Despite its prevalence, Stock Market prediction remains a secretive and empirical art. Few people, if any, are willing to share what successful strategies they have. A chief goal of this project is to add to the academic understanding of stock market prediction. The hope is that with a greater understanding of how the market moves, investors will be better equipped to prevent another financial crisis. The project will evaluate some existing strategies from a rigorous scientific perspective and provide a quantitative evaluation of new strategies.

It is important here to define the scope of the project. Although vital to any investor operating in the real world, no attempt is made in this project at portfolio management. Portfolio management is largely an extra step done after an investor has made a prediction on which direction any particular stock will move. The investor may choose to allocate funds across a range of stocks in such a way as to minimize his or her risk. For instance, the investor may choose not to invest all of their funds into a single company lest that company takes an unexpected turn.
A more common approach would be for an investor to invest across a broad range of stocks based on some criteria he or she has decided on beforehand.

This project will focus exclusively on predicting the daily trend (price movement) of individual stocks. The project will make no attempt to decide how much money to allocate to each prediction. Rather, the project will analyse the accuracies of these predictions.

Additionally, a distinction must be made between the trading algorithms studied in this project and high frequency trading (HFT) algorithms. HFT algorithms make little use of intelligent prediction and instead rely on being the fastest algorithm in the market. These algorithms operate on the order of fractions of a second. The algorithms presented in this report will operate on the order of days and will attempt to be truly predictive of the market.

Chapter 2
Considerations in Approaching the Problem

Throughout the project, there are three ideas that warn us that we might not find a profitable way to predict market trends.

2.1 Random Walk Hypothesis

The random walk hypothesis sets out the bleakest view of the predictability of the stock market. The hypothesis says that the market price of a stock is essentially random. The hypothesis implies that any attempt to predict the stock market will inevitably fail.

The term was popularized by Malkiel [13]. Famously, he demonstrated that he was able to fool a stock market 'expert' into forecasting a fake market. He set up an experiment where he repeatedly tossed a coin. If the coin showed heads, he moved the price of a fictitious stock up, and if it showed tails then he moved it lower. He then took his random stock price chart to a supposed expert in stock forecasting, and asked for a prediction. The expert was fooled and recommended that he buy the stock immediately.

It is important for the purpose of this project to confront the Random Walk Hypothesis. If the market is truly random, there is little point in continuing.
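The coin-toss experiment described above is simple to reproduce. The following is a minimal sketch, not a reconstruction of Malkiel's exact procedure: the starting price of 100, the fixed step of 1 per toss, and the name `coin_toss_prices` are all assumptions made for illustration.

```python
import random


def coin_toss_prices(n_days, start=100.0, step=1.0, seed=None):
    """Build a fictitious price series: up one step on heads, down one on tails."""
    rng = random.Random(seed)  # seeded generator so a chart can be reproduced
    prices = [start]
    for _ in range(n_days):
        heads = rng.random() < 0.5
        prices.append(prices[-1] + (step if heads else -step))
    return prices


# Roughly one trading year of coin tosses (hypothetical length).
prices = coin_toss_prices(250, seed=1)
```

Plotting `prices` against the day index gives a chart of the kind that would be shown to a forecaster.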
2.1.1 Qualitative Similarity to Random Pattern

The stock market can certainly look random to the eye of a casual observer. To demonstrate this, we created a perfectly random process that had striking visual similarity to real stock market data. The random process used to generate this is defined in equation 2.1, where $a_0 = 0$, $0.995 \le \rho < 1$, and $q_n$ and $r_n$ are random values taken from a standard normal distribution. $b_0$ can be initialised at any desired number.

$$
\begin{aligned}
a_n &= a_{n-1} \cdot \rho + q_n \\
b_n &= b_{n-1} + \rho \cdot r_n \\
f(n) &= a_n + b_n
\end{aligned}
\tag{2.1}
$$

A sequence generated using equation 2.1 is displayed in figure 2.1. The graph plots an example generation of $f(0), f(1), \dots, f(500)$.

Figure 2.1: Example of a randomly generated pattern

We then visually compare this random process to a real piece of market data in figure 2.2.

Figure 2.2: Centered AAPL stock price, some time after 2010

Presented with both of these diagrams, and without the aid of time scales or actual prices, most people would find it impossible to differentiate the diagrams. Using visual inspection alone, either of these diagrams could just as likely be a real piece of stock market data.

This gives us pause as there is little point in moving forward if the stock market is truly random and there is nothing to predict. However, this does not turn out to be the case.

2.1.2 Quantitative Difference to Random Pattern

In this section, we will demonstrate that prices, specifically the way they move, are fundamentally different to random data.

Karpio et al. [11] describe an asymmetry between gains and losses on the stock market. Their research looks specifically at indices like the Dow Jones Industrial Average. They describe how "you wait shorter time (on average) for loss of a given value than for gain of the same amount". This research was conducted in 2006, before the Great Recession. It is conceivable that the market has conducted itself differently since then, and therefore we tried to replicate their findings.
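Two computations in this chapter are easy to sketch in code: the random process of equation 2.1, and the waiting-time measurement that underlies a gain-loss asymmetry test. The following is a minimal sketch and not the code used in this project. The values $\rho = 0.995$ and $b_0 = 100$, the 5% default threshold, and the names `random_pattern` and `waiting_times` are assumptions chosen for illustration.

```python
import random


def random_pattern(n, rho=0.995, b0=100.0, seed=None):
    """Return [f(0), ..., f(n)] following equation 2.1.

    a_n = a_{n-1} * rho + q_n,  b_n = b_{n-1} + rho * r_n,  f(n) = a_n + b_n,
    with q_n and r_n drawn from a standard normal distribution.
    """
    rng = random.Random(seed)
    a, b = 0.0, b0  # a_0 = 0; b_0 is a free choice (assumed 100 here)
    values = [a + b]
    for _ in range(n):
        q = rng.gauss(0.0, 1.0)
        r = rng.gauss(0.0, 1.0)
        a = a * rho + q
        b = b + rho * r
        values.append(a + b)
    return values


def waiting_times(series, threshold=0.05):
    """For each start day, count the days until the value first gains or
    loses `threshold` of its starting value. Assumes positive values."""
    gains, losses = [], []
    for i, start in enumerate(series):
        upper = start * (1 + threshold)
        lower = start * (1 - threshold)
        for j in range(i + 1, len(series)):
            if series[j] >= upper:
                gains.append(j - i)
                break
            if series[j] <= lower:
                losses.append(j - i)
                break
    return gains, losses


pattern = random_pattern(500, seed=0)
gain_days, loss_days = waiting_times(pattern)
```

For a symmetric random sequence the two lists of waiting times would be expected to have similar distributions; a persistent difference between them is the asymmetry that Karpio et al. [11] report for index data.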
On every day from the year 2000 to 2014, we recorded the value of the Dow Jones index. We then counted the number of days it took for the value to gain or lose 5% of its original value. When it lost 5% of its value, it was put into the red set; when it gained 5% of its original value, it was put into the green set. Figure 2.3 shows two overlaid histograms detailing how long it took the red and green sets to lose or gain 5% respectively.

Figure 2.3: Gain-Loss Asymmetry on the Dow Jones

Figure 2.3 shows that the green set, the instances that gained 5% of their original value, took longer on average to reach that point than the red set, the instances that lost 5% of their original value. This indicates that the market generally creeps upwards but is prone to sudden drops downwards, and supports the findings of Karpio et al. [11].

This demonstrates that the stock market is fundamentally different to random data. This gives us hope for the remainder of the project. If the market price is not random, then it might be worth investigating and trying to predict.

2.2 Efficient Market Hypothesis

Another concept to keep in mind while working on the project was the Efficient Market Hypothesis. Informally, the Efficient Market Hypothesis says that the market is efficient at finding the correct price for a stock. It comes in three flavours. However, it is still a matter of debate which one, if any, is correct.

Weak-form Efficient Market Hypothesis
The weak form of the hypothesis says that no one can profit from the stock market by looking at trends and patterns within the price of a product itself. It is important to note that this does not rule out profiting from predictions of the price of a product based on data external to the price. We will see examples of prediction based on both in sample and out of sample data, and provide evidence in support of the weak form.
Semi-Strong Efficient Market Hypothesis
The semi-strong form rules out all methods of prediction, except for insider trading. This means that if we are only to use public domain information in our prediction attempt, the semi-strong form says that we will be unsuccessful.

Strong form Efficient Market Hypothesis
The strong form says that no one can profit from predicting the market, not even insider traders.

Clearly, if we are to predict the stock market using only public information, we must hope that at most the weak form of the Efficient Market Hypothesis is true so that at least then we can use external data to predict the price of a product.

2.3 Self Defeating Strategies

Finally there is the idea of a successful model ultimately leading to its own demise. The insight is that if there were a simple predictive model that anyone could apply and profit from themselves, then over time all of the advantage would be traded and eroded away. This is the same reason for the lack of academic papers on the topic of profitably predicting the market. If a successful model was made widely known, then it wouldn't take long until it wouldn't be successful any more.

2.4 Conclusions

The three preceding ideas ask us to keep an open mind on stock market prediction. It is possible that we will not be able to do it profitably.

Chapter 3
Review of Existing Work

In this section, we will review existing academic literature in regard to predicting the stock market. We will look at two articles, one in support of technical analysis methods and another in support of machine learning methods. We will see that both leave room for improvement.

3.1 Article 1 - Kara et al. [10]

The first article we will review is "Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul Stock Exchange" by Kara et al. [10].
The article uses technical analysis indicators to predict the direction of the ISE National 100 Index, an index traded on the Istanbul Stock Exchange. The article claims impressive results, up to 75.74% accuracy.

Technical analysis is a method that attempts to exploit recurring patterns and trends within the price of the stock itself. It goes directly against all forms of the Efficient Market Hypothesis. As described previously, even the weak form of the Hypothesis rules out prediction using historic price data alone.

The team uses a set of 10 technical analysis indicators which are listed below.

• Simple 10-day moving average
• Weighted 10-day moving average
• Momentum
• Stochastic K%
• Stochastic D%
• Relative Strength Index
• Moving Average Convergence Divergence
• Larry Williams R%
• Accumulation Distribution Oscillator
• Commodity Channel Index

The daily values for the indicators are calculated and coupled with the daily price movement direction, the dependent variable. Two types of model are tested: a support vector machine and a neural network. Results are cross-validated using a single-holdout method. This method of cross-validation is known to be inferior when compared to other techniques such as k-fold cross-validation [12], but it is unlikely that this would have a drastic effect on the results presented in the article. The team's neural network model had an accuracy of 75.74% and their support vector machine had an accuracy of 71.52%.

Such success using technical analysis is in sharp contrast with work that will be presented later in the report. One explanation for the difference in results is that the ISE National 100 index may be more accommodating to technical analysis indicators than the set of stocks reviewed later in this report. This is understandable as all stocks in this set are traded on the New York Stock Exchange, which is the most computerised exchange in the world.
It is likely that there are fewer trading algorithms competing on the Istanbul Stock Exchange and it is therefore easier to leverage algorithmic approaches such as Technical Analysis.

However, there is another flaw in the methodology of the article that is far more likely to have dramatically increased the performance of their algorithms. In section three of the article, the team describes how they prepared their research data. Specifically,

    The direction of daily change in the stock price index is categorized as "0" or "1". If the ISE National 100 Index at time $t$ is higher than that at time $t-1$, direction $t$ is "1". If the ISE National 100 Index at time $t$ is lower than that at time $t-1$, direction $t$ is "0".

At first this would appear sensible, but their features also make use of information from time $t$. For example, their moving average calculation is defined in equation 3.1, where $C_t$ is the closing price at time $t$.

$$
\text{Simple 10-day moving average} = \frac{C_t + C_{t-1} + \dots + C_{t-10}}{10} \tag{3.1}
$$

In equation 3.1, the simple moving average is calculated using the closing price at time $t$ and $t-1$. In fact, all features used in the article use some information from time $t$ and $t-1$, as well as others. But recall that the dependent variable is a classification of the difference between prices at times $t$ and $t-1$. This means, in effect, that they are using information about tomorrow to predict tomorrow's price. In a real world situation, no trader or trading algorithm would have the same level of information about tomorrow when they are making their predictions.

If we wish to build a stock market prediction model that is useful in the real world, then we must only provide the model with information it could have in actuality. In reality, no model would have access to information about tomorrow when making a prediction. This is ignored by the article, which provides its models with features which contain a large amount of information about tomorrow.
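The leak can be made concrete with a small sketch. It is illustrative only and is not the code of Kara et al. [10]: the function names (`sma`, `direction`, `leaky_example`, `causal_example`) are hypothetical, and a conventional 10-term average is assumed rather than the exact expression in equation 3.1.

```python
def direction(closes, t):
    """Label for day t: 1 if the close at t is above the close at t-1, else 0."""
    return 1 if closes[t] > closes[t - 1] else 0


def sma(closes, end, window=10):
    """Mean of the `window` closing prices ending at index `end` (inclusive)."""
    return sum(closes[end - window + 1 : end + 1]) / window


def leaky_example(closes, t, window=10):
    # The feature includes C_t, the same close that determines the label at t.
    return sma(closes, t, window), direction(closes, t)


def causal_example(closes, t, window=10):
    # The feature stops at t-1, so nothing from day t reaches the model.
    return sma(closes, t - 1, window), direction(closes, t)


closes = [float(x) for x in range(1, 21)]  # toy prices, steadily rising
bumped = closes[:]
bumped[10] = 0.0  # change only the close on the day being predicted

leaky_before, leaky_after = leaky_example(closes, 10)[0], leaky_example(bumped, 10)[0]
causal_before, causal_after = causal_example(closes, 10)[0], causal_example(bumped, 10)[0]
```

Changing only the close on day $t$ moves the leaky feature but leaves the causal feature untouched, which is the property a usable feature needs.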
This cannot be ignored when considering the team's reported success.

Nevertheless, it can be argued that the work may still be useful. Although the impressive figure of 75.74% prediction accuracy would be impossible to replicate in a real world model because of the problems described above, perhaps it tells us something else. We could reinterpret the 75.74% accuracy result not as predictive accuracy, but as a measure of how well the technical analysis indicators agreed with the true price movement on a given day. In other words, even though the indicators ultimately had access to information which included the true price movement, they agreed with it 75.74% of the time.

Using this somewhat liberal interpretation, Kara et al. [10] have not demonstrated the predictive power of technical analysis, but demonstrated that it might be worth investigating. Later in this report, we will apply the corrected methodology and attempt to build a model using some of the same features.

3.2 Article 2 - Shen et al. [19]

The second article we will look at is "Stock Market Forecasting Using Machine Learning Algorithms" by Shen et al. [19].

The article makes a case for the use of machine learning to predict large American stock indices, including the Dow Jones Industrial Average. The article boasts a 77.6% accuracy rate for the Dow Jones specifically.

The team uses a set of 16 financial products and uses their movements to predict movements in American stock exchanges. Many of these products will be used later in this report. Some of the financial products used are listed below.

• FTSE index price
• DAX index price
• Oil price
• EURO/USD exchange rate

The article makes good use of explorative methods in the data preparation stage. The authors show using graphs that some features may have predictive power because of their correlation to the NASDAQ index. They then go on to perform feature selection based on the predictive power of each feature on its own.
The results presented in this section, the predictive power of single features, are very similar to results that we will show later in this report. After they have selected their top 4 features, they compare a Support Vector Machine model to a Multiple Additive Regression Trees model for predicting the daily NASDAQ trend. Their winning model is the SVM with 74.4% accuracy.

While the results presented by Shen et al. [19] appear to be very much in line with results that will be presented later in this report, the article is quite vague in terms of the methodology that was used. For instance, there is no record of what model was used in the feature selection step or how cross-validation was carried out to calculate any of their results. However, their results will be supported by results in this report, so it is safe to assume that no critical errors were made in these steps.

After they successfully train and test a model, they use it to simulate a real life trading environment. Their models perform extremely well on the simulated trading example, with an average return rate of 8% every 50 days.

At this point, the results of Shen et al. [19] and the results of this report diverge. The simulation in the article fails to account for overlapping trading hours. For instance the FTSE, which is traded in London, and the Dow Jones, which is traded in New York, are both trading simultaneously for three to four hours each day. This is enough time for the New York Stock Exchange to influence the London Stock Exchange. It follows then that it is incorrect to use the closing prices in London to predict New York's movements.

While training and testing the models, both the article and this report ignore the overlapping of trading hours. However, this should certainly not be ignored when estimating real world trading performance. Later in this report, we will simulate real world trading using Quantopian.
In Quantopian we will have access to intraday prices which will allow us to correctly account for the overlapping trading hours.

Chapter 4
Data and Tools

4.1 Data Used

4.1.1 Choosing the Dataset

For this project, we chose the Dow Jones Industrial Average and its components as a representative set of stocks. The Dow Jones is a large index traded on the New York Stock Exchange. It is a price-weighted index of 30 component companies. All companies in the index are large publicly traded companies, leaders in each of their own sectors. The index covers a diverse set of sectors featuring companies such as Microsoft, Visa, Boeing, and Walt Disney.

It is important to use a predefined set of companies rather than a custom selected set so that we do not leave ourselves open to methodology errors or accusations of fishing expeditions. If we had selected a custom set of companies, it could be argued that the set was tailored specifically to improve our results. Since the aim of the project is to create a predictive model of stock markets in general, this is a claim that we want to avoid.

The Dow Jones was chosen because it is well known, and has a relatively small number of components. The leading index, the S&P 500, has over 500 components, but ultimately analysing 500 companies would be too computationally expensive for the resources at hand. The Dow Jones components provided a good balance between available data and computational feasibility. Although there were only 30 companies, for many of the experiments carried out in this report, datasets many thousands of examples in size were obtainable.

4.1.2 Gathering the Datasets

A primary dataset will be used throughout the project. The dataset will contain the daily percentage change in stock price for all 30 components of the Dow Jones.

Luckily, daily stock price data is easy to come by. Google and Yahoo both operate websites which offer a facility to download CSV files containing a full daily price history.
These are useful for looking at individual companies but cumbersome when accessing large amounts of data across many stocks.

For this reason, Quandl was used to gather the data instead of using Google and Yahoo directly. Quandl is a free to use website that hosts and maintains vast amounts of numerical datasets with a focus specifically on economic datasets, including stock market data which is backed by Google and Yahoo. Quandl also provides a small Python library that is useful for accessing the database programmatically. The library provides a simple method for calculating the daily percentage change in prices. This calculation is defined in equation 4.1, where $p_d$ is the closing price on day $d$, and $\delta p_d$ is the resulting percentage change.

$$
\delta p_d = \frac{p_d - p_{d-1}}{p_{d-1}} \tag{4.1}
$$

To build the primary dataset, a simple program was built to query the Quandl API for each of the 30 Dow Jones components using the transformation outlined in equation 4.1. Before finally saving the data, the gathered data was augmented with an additional column containing the classification of the daily percentage change. This augmentation is defined in equation 4.2, where $trend_d$ is the price movement direction on day $d$ and $\delta p_d$ is as defined in equation 4.1.

$$
trend_d =
\begin{cases}
\text{Loss} & \text{if } \delta p_d < 0 \\
\text{Gain} & \text{if } \delta p_d > 0 \\
\text{Neutral} & \text{if } \delta p_d = 0
\end{cases}
\tag{4.2}
$$

The final step is shifting all of the $\delta p_d$ and $trend_d$ data points backwards by one day. The data we have gathered for any day should be used to predict the trend tomorrow, not the same day. By shifting $\delta p_d$ and $trend_d$ backwards, data gathered in later sections will be paired with the correct dependent variable. For instance, data we gather for a Monday will be matched with, and try to predict, Tuesday's trend.

This dataset was then saved in CSV format for simple retrieval as needed throughout the project. This dataset containing the daily trends of companies will serve as the core dataset that will be used in most experiments later in the report.
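The three preparation steps above (equation 4.1, equation 4.2, and the backward shift) can be sketched in plain Python. This is a dependency-free illustration under assumed names (`pct_change`, `trend`, `build_rows`), not the program used in this project and not the Quandl library's API. With pandas, the `pct_change` and `shift` methods of a Series would cover the first and last steps.

```python
def pct_change(closes):
    """Equation 4.1: (p_d - p_{d-1}) / p_{d-1}. The first day has no value."""
    return [None] + [(p - prev) / prev for prev, p in zip(closes, closes[1:])]


def trend(delta):
    """Equation 4.2: classify a daily change as Loss, Gain, or Neutral."""
    if delta is None:
        return None
    if delta > 0:
        return "Gain"
    if delta < 0:
        return "Loss"
    return "Neutral"


def build_rows(dates, closes):
    """Pair each date with the NEXT day's change and trend (the backward shift)."""
    deltas = pct_change(closes)
    shifted = deltas[1:] + [None]  # day d now carries the change of day d+1
    return [(d, dp, trend(dp)) for d, dp in zip(dates, shifted) if dp is not None]


# Toy closing prices for four consecutive trading days (made-up values).
dates = ["Mon", "Tue", "Wed", "Thu"]
closes = [100.0, 110.0, 99.0, 99.0]
rows = build_rows(dates, closes)
```

After the shift, the row for Monday holds Tuesday's trend, so any feature recorded on Monday is paired with the value it is meant to predict. The last day is dropped because its next-day trend is not yet known.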
When we want to use the dataset later with the extra data we collect for each experiment, we only need to do a simple database join operation on the company and date.

The dataset contains 122,121 rows covering all 30 companies daily since January 1st 2000. Table 4.1 shows an extract of a number of rows from this dataset.

Table 4.1: Data Extract of Primary Dataset

Date        Symbol  $\delta p_d$  $trend_d$
2000-01-05  DD       0.04230      Gain
2000-01-05  DIS      0.03748      Gain
2000-01-05  GE      -0.00749      Loss
2000-01-05  GS      -0.04248      Loss
2000-01-05  HD      -0.00097      Loss
2000-01-05  IBM      0.03515      Gain

With the primary dataset prepared, we will combine it as needed with additional datasets to carry out individual experiments.

4.1.3 Limitations of the Data

Although the data gathered in the previous section is certainly a good start, it is admittedly far behind what any serious investor more than likely has access to.

One obvious piece of missing data is the intraday prices, i.e. the prices minute by minute. It is possible that this data could be used to guide an investor's decisions on the interday level. However, intraday prices are not as freely available as interday prices and are considered a commodity in themselves. To get hold of such a dataset would incur a large cost, one that is not within the budget of a project such as this. Later in the project we will evaluate a strategy in which this limitation becomes significant.

Another important piece of missing data is the order book. The order book is a record of live buy and sell orders for a particular stock. It consists of the amount of stock each trader is willing to buy or sell, as well as their price. Successful orders are matched off against the order book by the exchange. The price of a stock is usually considered to be half way between the highest buying price and the lowest selling price.

It is easy to imagine that the order book contains useful data. For instance, the weighted average of orders might be predictive of the price.
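To make the two order book quantities mentioned above concrete, the following sketch computes a mid price and one possible weighted average. The representation of an order as a `(price, size)` pair, the function names, and the choice to weight by order size are all assumptions, since no order book data was available to this project and the text does not fix a definition of the weighting.

```python
def mid_price(bids, asks):
    """Half way between the highest buying price and the lowest selling price."""
    best_bid = max(price for price, _ in bids)
    best_ask = min(price for price, _ in asks)
    return (best_bid + best_ask) / 2


def size_weighted_price(orders):
    """Average order price weighted by order size (one reading of
    'weighted average of orders'; other weightings are possible)."""
    total_size = sum(size for _, size in orders)
    return sum(price * size for price, size in orders) / total_size


# Hypothetical order book: (price, number of shares).
bids = [(99.0, 200), (98.5, 500)]
asks = [(101.0, 100), (101.5, 300)]

mid = mid_price(bids, asks)
weighted = size_weighted_price(bids + asks)
```

In this toy book the size-weighted price sits below the mid price because more shares are bid than offered, which is the kind of imbalance that might carry predictive information.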
However, access to this data is extremely costly and far beyond what most casual investors can afford, let alone the budget for this project.

With no way around these limitations, we use the data provided by Quandl.

4.2 Tools

Python and associated packages

Python was the language of choice for this project. This was an easy decision for multiple reasons.

1. Python as a language has an enormous community behind it. Any problems that might be encountered can be easily solved with a trip to Stack Overflow. Python is among the most popular languages on the site, which makes it very likely there will be a direct answer to any query [17].

2. Python has an abundance of powerful tools ready for scientific computing. Packages such as NumPy, Pandas, and SciPy are freely available, performant, and well documented. Packages such as these can dramatically reduce and simplify the code needed to write a given program. This makes iteration quick.

3. Python as a language is forgiving and allows for programs that look like pseudo code. This is useful when pseudo code given in academic papers needs to be implemented and tested. Using Python, this step is usually reasonably trivial.

However, Python is not without its flaws. The language is dynamically typed and packages are notorious for Duck Typing. This can be frustrating when a package method returns something that, for example, looks like an array rather than being an actual array. Coupled with the fact that standard Python documentation does not explicitly state the return type of a method, this can lead to a lot of trial and error testing that would not otherwise happen in a statically typed language. This is an issue that makes learning to use a new Python package or library more difficult than it otherwise could be.
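The duck typing problem described above can be seen even in the standard library. In the sketch below, `dict.keys()` stands in for a package method whose return value looks like a list without being one. The helper `first_item` and the sample prices are hypothetical and are used only for illustration.

```python
def first_item(values):
    """Return the first element, converting objects that cannot be indexed."""
    try:
        return values[0]
    except TypeError:
        # The object iterates like a list but does not support indexing.
        return list(values)[0]


prices = {"IBM": 120.5, "GE": 25.1}  # made-up closing prices
symbols = prices.keys()  # a dictionary view, not a list

# It behaves like a list for length and membership tests...
looks_like_list = len(symbols) == 2 and "IBM" in symbols
# ...but it is not one, and symbols[0] would raise a TypeError.
is_list = isinstance(symbols, list)

first = first_item(symbols)
```

Converting explicitly with `list(...)` at the point where a value enters our own code is one way to remove this kind of uncertainty.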