I’m writing a book on ballistics, based in part on my work on ballistipedia.com, and I reached the point where I wanted a raw ballistic calculator to create examples. I found a public implementation of a well-known calculator written in Python and decided to start with that.
Nothing gets you up to speed on software engineering tools like having (a) a project you want to complete and (b) existing code to work from. In this case the existing code used a lot of Cython, which is a hybrid of Python and C that I have been meaning to pick up for a long time. I have been using Python more than any other language for the last six years. But I began coding in C more than thirty years ago. So why did I end up using Python if C is still relevant?
Some features of Python:
It’s a high-level language that encourages concise and readable code.
It’s the most popular programming language. Whatever you want to do, you can probably find existing (and free) Python packages that get you most of the way there. For example: I recently needed to scrape a bunch of data from a website. Bing’s chatbot recommended a package called selenium, and in less than an hour and 30 lines of Python I was done.
It is interpreted, not compiled. This makes it fast to write and easy to prototype in, but it also means the code runs relatively slowly.
In the past decade computers have gotten so fast that the last point is usually not an issue, and lots of production systems run on Python. That is surprising once you realize how much slower Python runs. As an example, I just made a Cython version of a simple statistical simulation I had written in Python. The Cython version can run 10 million simulations in 15 seconds. In the same time, the Python version completes only 75 thousand simulations. The compiled Cython version is over 100 times faster!
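To put those figures on a common scale: 10 million simulations in 15 seconds is roughly 667,000 per second, 75 thousand in 15 seconds is 5,000 per second, and the ratio is about 133. The simulation itself isn't shown here, so the sketch below uses a stand-in: a hypothetical `simulate_once` that draws ten normally distributed shot impacts and returns their mean distance from center. The function names, the shot count, and the number of runs are all assumptions for illustration. It only shows one way to measure the pure-Python throughput that a compiled version would be compared against.

```python
import random
import time


def simulate_once(rng: random.Random, n_shots: int = 10) -> float:
    """Hypothetical stand-in simulation: mean radius of n_shots impacts
    drawn from a standard bivariate normal distribution."""
    total = 0.0
    for _ in range(n_shots):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(0.0, 1.0)
        total += (x * x + y * y) ** 0.5
    return total / n_shots


def simulations_per_second(n_sims: int = 20_000, seed: int = 1) -> float:
    """Time n_sims runs of simulate_once and return the throughput."""
    rng = random.Random(seed)
    start = time.perf_counter()
    for _ in range(n_sims):
        simulate_once(rng)
    elapsed = time.perf_counter() - start
    return n_sims / elapsed


if __name__ == "__main__":
    rate = simulations_per_second()
    print(f"{rate:,.0f} simulations per second in pure Python")
```

The absolute number depends on the machine and the Python version, so only the ratio between two implementations timed on the same hardware means much.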
So how does a programming language that runs two orders of magnitude more slowly than the alternatives become so common? This is interesting, because Python has been around for 30 years, but its use has only really exploded in the last decade.[1] The other thing that has exploded over this timeframe is the amount of excess compute – a term that refers not only to processing speed but also to storage capacity and information transmission bandwidth. A generation ago we marveled that a student calculator had more compute than the guidance systems that landed man on the moon. Today, there is so much compute in a typical smartphone that developers rarely have to worry about speed or memory. And so we use tools to build and publish apps that are hundreds of times larger and slower than necessary because (a) they’re easier and (b) it’s no longer worth the effort to make them small or fast.
Have you ever noticed the size of installation files for smartphone apps? Very few weigh in at less than 10 megabytes. Heck, the basic calculator app on my phone takes up 8 MB of memory! Why? 30 years ago the same app under Windows 3.1 used 41 kB of memory (that’s roughly 1/200th the size).
It’s not a coincidence that I didn’t use Python much earlier. Most of my work is in the finance industry. In the 1990s we could never get enough storage for market data or speed to run analyses, so we spent a lot of time optimizing our C++ code to squeeze as much as we could out of our compute resources.
In the early 2000s I still knew every detail of every available piece of computer hardware because I was working each one to its limits. I still remember my excitement at getting a server with two 64-bit Opteron CPUs and breaking out of 32-bit memory space! The early 2000s were also the point at which we could afford to use a SQL database instead of handling every detail of reading and writing our data to files and keeping track of exactly what was being held in RAM at each moment.
The last decade is when we could get really sloppy. With gigabytes of RAM on every computer it’s usually possible to keep everything of interest in memory; no need for a database. Where previously I used C++ or C# to build production systems for trading and portfolio management, computers are now so fast that I actually built, and now run, my most recent system in Microsoft Excel! I’m not kidding: Here’s a screenshot of Excel (left) consuming realtime data and generating portfolio analysis next to an Interactive Brokers Trader Workstation (right):
[1] TIOBE publishes data on programming language popularity. Stack Overflow is another good indicator, which is where I got this chart: