MeTal is far from my only programming project, although it's probably the most significant. One of the other things I've been dabbling in is a recreation of a classic 1980s arcade game, also in Python. Working on it has provided me with some perspectives about the much-discussed issue of Python performance.
Some things become plain when you write games or other performance-intensive applications in Python vs., say, C/C++:
You inevitably trade performance across the board for convenience.
You can gain back a lot of that performance, but not all of it.
How much you gain back is largely a matter of what you, the programmer, are comfortable with, and not necessarily a technical barrier.
The last of these three is, I think, the most important issue and also the most underappreciated one. While working on my Robotron recreation, I was able to get performance that was pretty respectable on modern hardware -- between 1% and 2% CPU, sometimes a tad more, on my 8-core desktop, and between 5% and 8% on my quad-core notebook (vintage 2012). On my slow-as-frozen-molasses HP Stream 11, with a whopping TWO cores, I got 25-30% -- but at least it ran, and it even maintained framerate.
For perspective, Crossy Road runs on the desktop at about 3-6% CPU.
For additional perspective, Robotron originally ran at something like 25% CPU, sometimes even more than that, on the notebook. I was able to shave it down not by doing any one thing, but by way of a whole mess of things:
Better use of OpenGL textures for objects. (This by itself provided a massive speedup.)
Rewriting the internal object queue so that instead of iterating over all game objects in a queue, I push the queue to an object and have it iterated over internally -- in essence, having only a single function call instead of dozens, 60 times a second.
Using Cython to provide speedups for the modules in the game that were the most CPU-intensive. The biggest culprit was the collision detection algorithm; moving that to C shaved off a huge amount of CPU.
I also got some boosts by rewriting the draw loop and in-lining some of the code that was normally masked behind a bunch of function calls in the underlying graphics library.
I also compiled some of the graphics library methods with Cython, although the speedup I got there was nominal. The best speedups with Cython come from annotating functions with type declarations; this is a little difficult with a library that spends most of its time passing around Python objects.
I disabled garbage collection except between waves or lives, as a way to further reduce jank.
I made sure the thread-switching timer inside the Python interpreter allowed all threads to run to completion before switching -- another way to reduce jank and keep cache and pipeline coherency.
I set the game process to use single-core affinity -- again, another way to keep cache coherency, since the threads aren't spread out across multiple cores.
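A minimal sketch of the last three items, assuming Python 3. None of this is lifted from the game's code: the function names, the 0.05-second switch interval, and the `play_wave` stand-in are all hypothetical. `os.sched_setaffinity` is only available on some platforms (Linux, mainly), so the sketch skips pinning where it is missing.

```python
import gc
import os
import sys
from contextlib import contextmanager


def pin_to_one_core():
    """Restrict this process to one CPU core, where the OS exposes the call."""
    if not hasattr(os, "sched_setaffinity"):  # not available on every platform
        return None
    try:
        core = min(os.sched_getaffinity(0))   # pick a core we are allowed to use
        os.sched_setaffinity(0, {core})
        return core
    except OSError:
        return None


@contextmanager
def gc_paused():
    """Turn the cyclic garbage collector off, restoring its state on exit."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()


def play_wave(frames):
    """Hypothetical stand-in for one wave of gameplay."""
    total = 0
    with gc_paused():          # no collector pauses while frames are running
        for frame in range(frames):
            total += frame     # placeholder for update + draw
    gc.collect()               # clean up between waves, when a hitch is harmless
    return total


def tune_runtime(switch_interval=0.05):
    """Apply the interpreter-level tweaks once at startup."""
    # The default interval is 0.005 s; a larger value (an assumed figure here)
    # makes the interpreter switch between threads less often.
    sys.setswitchinterval(switch_interval)
    return pin_to_one_core()


tune_runtime()
play_wave(600)
```

Reference counting still frees most objects immediately while the collector is off; only reference cycles pile up until the explicit `gc.collect()` between waves.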
There's only so much speedup you can get from these tricks, before the overhead of the Python interpreter itself starts becoming the real barrier.
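To make the object-queue change from the list above concrete, here is a small sketch of the idea, not the game's actual code. The `Sprite` class and both update functions are hypothetical. The first version pays for one Python method call per object per frame; the second hands over the whole queue and loops inside a single call.

```python
class Sprite:
    """Hypothetical game object with a position and a velocity."""
    __slots__ = ("x", "y", "dx", "dy")

    def __init__(self, x, y, dx, dy):
        self.x, self.y, self.dx, self.dy = x, y, dx, dy

    def update(self, dt):
        # Called once per object: one method call for every sprite, every frame.
        self.x += self.dx * dt
        self.y += self.dy * dt


def update_each(queue, dt):
    """Original shape: iterate outside, call into every object."""
    for sprite in queue:
        sprite.update(dt)


def update_all(queue, dt):
    """Batched shape: one call per frame, with the loop body in-lined."""
    for sprite in queue:
        sprite.x += sprite.dx * dt
        sprite.y += sprite.dy * dt


queue = [Sprite(float(i), 0.0, 2.0, 1.0) for i in range(50)]
update_all(queue, 0.5)  # a single call covers all 50 sprites
```

Both versions produce the same positions. The batched form simply avoids the per-object call overhead, and it also gives Cython a single hot function to compile.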
Now, for all I know, there may well be other things that provide even more speedup. I haven't tried PyPy, for instance, if only because it doesn't work very well with this project (yet), and when I tried an earlier build of Nuitka, it gave me results that were far worse than my own hand-optimized build. All of those avenues may well yield better results in the future.
If I had written the whole thing in C/C++, with something like Allegro or SDL2, I would easily have seen a 10x boost in speed, if not more. But that, too, would have come at multiple costs:
Python is far easier and more comfortable for me to develop in than C/C++.
Speed of development would have been much slower.
I wouldn't have access to any of Python's conveniences.
I doubt the performance gain for this particular game would have mattered much on modern hardware. I would have seen pretty big improvements on my HP Stream 11, but how many people really run games on such hardware?
All programming is about choosing between tradeoffs. For me, development comfort matters; I don't like C/C++ syntax, and Python is just easier for me to work with and reason about. I do plan on getting more C/C++ under my belt over time, though -- just as a way to see where knowing all that takes me. But for now, Python and I are tightly knit.