I discovered strace somewhere between my first part-time web development job in 2005 and my first full-time “software engineering” job in 2008, and it seemed like a superpower, giving me x-ray vision into running infrastructure. When a process was stuck, or exiting after a cryptic error message, instead of grepping around I could get a pretty good timeline of what the process was up to.
It has some fatal flaws: the ptrace-based technology can cause performance issues, so it’s mostly not suited to running in production (unless you were one of my personal heroes: the old mysql@facebook team, which liked to live dangerously and often used ptrace-based debuggers to obtain live profiling data).
Recently I read a fantastic article walking through JIT optimizations and how changes to source code can impact them: Side effecting a deopt.
As I shared it with folks, a few of them had questions about low-level optimizations in general, so I wrote this little explainer to go along with the original article. It is for people who are interested in learning more about how JavaScript runtimes can model/compile/jit/execute their JS code.
Tl;Dr Cgo calls take about 40ns, about the same time encoding/json takes to parse a single-digit integer. On my 20-core machine Cgo call performance scales with core count up to about 16 cores, after which some known contention issues slow things down.
Disclaimer While a lot of this article argues that “Cgo performance is good actually”, please don’t take that to mean “Cgo is good actually”. I’ve maintained production applications that use Cgo and non-trivial bindings to Lua.
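To get a feel for the yardstick in that tl;dr, here is a rough sketch that times only the encoding/json half of the comparison (it does not call into C at all, so it needs no C toolchain). The helper name `parseDigit` and the input `"7"` are my own assumptions for illustration, and the number it prints will vary by machine and Go version.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// parseDigit decodes a single-digit JSON integer with encoding/json.
// (Hypothetical helper, named here only for this sketch.)
func parseDigit(input []byte) (int, error) {
	var n int
	err := json.Unmarshal(input, &n)
	return n, err
}

func main() {
	input := []byte("7") // arbitrary single-digit value chosen for illustration

	// testing.Benchmark runs the function outside of `go test`
	// and picks b.N on its own.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			if _, err := parseDigit(input); err != nil {
				b.Fatal(err)
			}
		}
	})
	fmt.Printf("encoding/json single-digit parse: %d ns/op\n", res.NsPerOp())
}
```

Comparing this against a Cgo call would need a second benchmark around a trivial C function, built with cgo enabled.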
So after several years of reading oversimplified and flat-out incorrect comments about threads and fibers/goroutines/async/etc and fighting this reaction:
I’ve decided to write my own still-over-simplified all-in-one guide to the difference between a couple of popular thread and fiber implementations. In order to keep this a blog post and not a novel I’m just going to focus on Linux threads, Go goroutines, and Rust threads.
tl;dr - Rust threads on Linux use 8 KB of memory, goroutines use 2 KB.
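If you want to sanity-check the goroutine half of that number yourself, here is a minimal sketch. It parks a batch of goroutines and divides the growth in `runtime.MemStats.StackInuse` by the count. The function name `stackBytesPerGoroutine` and the batch size are assumptions of mine, and this measures stack memory only, so treat the result as a rough estimate rather than a full accounting.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// stackBytesPerGoroutine parks n goroutines and reports the average growth
// in in-use stack memory per goroutine. (Hypothetical helper for this sketch.)
func stackBytesPerGoroutine(n int) uint64 {
	var before, after runtime.MemStats
	var wg sync.WaitGroup
	stop := make(chan struct{})

	runtime.GC()
	runtime.ReadMemStats(&before)

	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			wg.Done() // signal that this goroutine has started
			<-stop    // then stay parked so its stack remains allocated
		}()
	}
	wg.Wait()

	runtime.ReadMemStats(&after)
	close(stop) // release the parked goroutines

	if after.StackInuse < before.StackInuse {
		return 0 // guard against unsigned underflow
	}
	return (after.StackInuse - before.StackInuse) / uint64(n)
}

func main() {
	const n = 10000 // arbitrary batch size, large enough to dwarf runtime noise
	fmt.Printf("approx. stack bytes per goroutine: %d\n", stackBytesPerGoroutine(n))
}
```

The goroutines here do almost nothing, so their stacks stay at the initial size. Goroutines that call deeper functions will grow their stacks and use more.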
Load testing tips Over more than a decade of getting retailers ready for a smooth Black Friday, I’ve collected a few tips, tricks, and stories related to keeping busy applications online during big events.
In fact there’s one simple (not easy!) trick to it: the best way to ensure your website can handle a big event is to have your website handle a big event. That may seem like a tautology, but it’s where this post starts and it’s where it ends.