
John Lam's Blog


That was an F1 championship for the ages; certainly the best one that I can remember since I started watching a decade ago during the Vettel era. A bit of luck certainly contributed to Max's win, but can we talk about just how well the Red Bull team managed Max during the race? The calls to pit for new hard tires under the virtual safety car and for new soft tires under the safety car clearly made the difference in a race where it was clear that Lewis had the faster car.

Mercedes really need to look in the mirror here: all they needed to do was cover off any pit stop that Max made. They will of course appeal to the FIA, but I can't imagine them overturning the title in the courts.

Something else to talk about: Sergio Perez. In an era of selfish F1 teammates, his sacrifice and eventual retirement this weekend clearly helped Max secure not only pole position but possibly the race win itself. His epic battle with Lewis made up a huge gap that let Max back into contention in the race.

Regardless of what happens in the end, this was an epic end to an amazing season. Both Lewis and Max deserved to win this race. Lewis behaved like the GOAT that he is and was so gracious in congratulating Max, gutting as it must have been for him. It was also great to see, in the back of the paddock, this scene with Anthony Hamilton and Jos Verstappen congratulating Max on his win.

Also these scenes with the fathers consoling and congratulating their sons:

These are the human moments in sport, and I'm grateful to have been here to see this. #


A recent interview with Anders Hejlsberg!

Anders Hejlsberg

I found this on Hacker News

  • His brother is one of the interviewers
  • They started programming 40 years ago in Copenhagen
  • Started using high school computer in the 1970s
  • Started by implementing a programming language, Turbo Pascal
  • Cut his teeth by adding extensions to existing language to make it useful
  • C# was first language designed from scratch
  • "You can always add but you can never take away"
  • The perfect programming language is one with no users(!)
  • V1.0 is the only greenfield
  • Game of learning to say no vs. yes
  • Language feature asks come from people with an instance of a problem - he tries to find the class of problems that it belongs to
  • HTML is an incomprehensible mess :)
  • TypeScript was his first project that, from the outset, was open source
  • Wanted Roslyn to be open source, but C# and .NET was not open source
  • Roslyn was not built as open source, was open sourced later
  • Open development - 2014 moved to GitHub - everything is on GitHub is the next step, e.g., design notes - close to users
  • Anders participates in C# design committee, but Mads has been running it for the last decade(!)
  • Multiple inheritance - usefulness does not outweigh the downsides of additional complexity
  • If he could change one thing in C# he would have nullability and value/reference types as orthogonal issues, e.g., you cannot have non-nullable reference types
  • Tony Hoare called inventing null his billion dollar mistake
  • Functional programming languages make circular-reference data types difficult (e.g., doubly-linked lists or trees with back-pointers)
  • Opting into nullability in specific areas
  • Don Syme did most of the implementation of generics in .NET 2.0
  • He regrets dynamic in C# 4.0
  • He likes the work done in C# 5.0 for async/await
  • TypeScript - how it solves the nullability problem. He is very happy with that result.
  • Turbo Pascal 1.0 symbol table was a linked list(!)
  • He read Algorithms + Data Structures = Programs by Niklaus Wirth - learned about hash tables and then reimplemented the symbol table in Turbo Pascal 2.0 using them (local copy in case this disappears from the web)
  • He learned by trial and error without formal background


I like it when things that I read and don't agree with get me thinking about an idea. This post by David Perell spent some time rattling around in my subconscious:

I didn't agree with it because almost all software that I use today requires me to Google something to figure out an obscure feature - there are always obscure features especially in software that I don't use all the time. If I don't know how to do something already using the UI, I would immediately Google it and follow the directions on how to accomplish the task.

I was out riding my bike today, and listening to Neal Stephenson being interviewed by Lex Fridman. During the conversation they talked about Google and search. The idea that popped into my head then was that UI is great for idiomatic operations and search is great for everything else.

The example in my head was VS Code. The idiomatic operations are provided by the UI, e.g., vi keybindings for navigating a document, tabs for managing multiple documents, a file explorer for viewing the contents of your project, a debugging tab for viewing the state of variables in the debugger etc. IMHO the real innovation in VS Code is the command palette which lets you search for the command that you want - this avoids over-complicating the UI with endless toolbars. VS Code users quickly learn to use the fuzzy search in the command palette to find the command that they are looking for. They even have the ability to bind those commands to a custom keybinding to turn it into an idiomatic operation if they use that command often enough.
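As a sketch of that last step (the key chosen here is just an example; editor.action.formatDocument is a built-in VS Code command ID, but any command from the palette works), a custom keybinding in keybindings.json looks like this:

```json
// keybindings.json ("Preferences: Open Keyboard Shortcuts (JSON)")
[
  {
    "key": "ctrl+alt+f",
    "command": "editor.action.formatDocument"
  }
]
```

Once saved, the command runs on the keystroke without a trip through the palette, turning a searched-for command into an idiomatic operation.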

I think this strikes the right balance between a rich featureset and a simple, idiomatic UI. Unfortunately, it seems that in our attempt to simplify everything for the novice user, we have wound up with UIs that have way too many layers (I'm looking at you, Microsoft Teams).

I wonder what a better experience for search would be on mobile though? A command palette is a lot more difficult to use on a phone. #


There is some good news coming out of South Africa today - it looks like the probability of severe outcomes from Omicron is lower than with Delta:

However, this is not the way to handle the Delta and nascent Omicron waves in the USA:


OK. We have some pretty good news coming out of South Africa on Omicron - it looks like the probability of severe outcomes from Omicron is lower than with Delta!

Unfortunately, it looks like we're trending the wrong way on Delta in the US:


Our talk yesterday was about using Transformer models to win a Kaggle competition at a training cost of less than $50 on Azure. Fortunately, Alon understands what Transformer models are, and he did a wonderful job of summarizing what a Transformer model is in about 5 minutes.

I really wanted to learn more about Transformer models; I've been treating them mostly as a black box and using pre-trained Transformer models to make cool things like my semantic wine review search engine. Coincidentally, this morning someone from work linked this tweet:

and he subsequently found a link to this fantastic explanation of Transformer models called Transformers From Scratch written by Brandon Rohrer (hi Brandon!) I'm still working through the piece, but one thing that I had not understood before was how matrix multiplication and one-hot encoded vectors are used to do branch-free selection of rows from a table. Let that sink in for a minute: how would you do that WITHOUT COMPARISONS and BRANCHING? 🤯
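Here's a toy sketch of my own (not from Brandon's piece) of what that means: a one-hot vector times a table is nothing but multiply-and-add, yet it picks out exactly one row.

```python
# Toy sketch: select a row from a "table" using only multiply-and-add.
# The one-hot vector does the selecting; there are no ifs or comparisons.

def select_row(table, one_hot):
    """Return sum_i one_hot[i] * table[i], i.e., a 1xN times NxM product."""
    n_cols = len(table[0])
    result = [0.0] * n_cols
    for weight, row in zip(one_hot, table):
        for j, value in enumerate(row):
            result[j] += weight * value
    return result

table = [
    [1, 0, 0],   # row 0
    [0, 5, 0],   # row 1
    [0, 0, 9],   # row 2
]
print(select_row(table, [0, 1, 0]))  # picks row 1
```

Every row contributes to the sum, but the zeros in the one-hot vector wipe out all of them except the one we asked for.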

Apparently this is one of the key insights from Transformers. There's a whole lot of branching and comparing in this Python list comprehension:

[x for x in ['fizz','buzz','fizz','fizz'] if x == 'fizz'] #

Continuing to take notes on Brandon's tutorial on the flight home. There is a concept of selective masking that is in the original Transformers paper. Here is an annotated Attention is All You Need paper.

In his explanation, there are a large number of unhelpful predictions where there is a 50:50 probability of some outcome in his highly simplified example. Masking drives low-probability events to zero to eliminate them from consideration, and is a central idea in Transformers.
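One common way this is implemented (an additive mask applied before the softmax, which is the mechanism described in the Attention Is All You Need paper; the numbers here are made up) can be sketched like this:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Raw attention scores for four positions; mask out the last two
# by adding -inf before the softmax, which drives their probability
# to exactly zero after exponentiation.
scores = [2.0, 2.0, 1.0, 3.0]
mask   = [0.0, 0.0, -math.inf, -math.inf]
probs = softmax([s + m for s, m in zip(scores, mask)])
print(probs)  # masked positions come out as 0.0
```

The masked positions are eliminated from consideration entirely, while the surviving scores are renormalized among themselves.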

He summarizes the first part of his explanation through three ideas:

  1. Turning everything into matrix multiplication is a good idea. As I observed above, being able to select rows out of a table (or matrix) by doing nothing more than matrix multiplication is incredibly efficient.

  2. Each step during training must be differentiable, i.e., each adjustment to a parameter must produce a measurable change in the model's error / loss function.

  3. Having smooth gradients is really important. He has a nice analogy between ML gradients and hills/mountains/valleys in the real world. He describes the art of data science as ensuring that the gradients are smooth and "well-conditioned", i.e., they shouldn't quickly drop to zero or rise to infinity.
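Ideas 2 and 3 can be sketched with a toy example of my own (not from Brandon's piece): nudge a parameter, measure the change in a smooth loss, and step downhill.

```python
def loss(w):
    # A smooth, "well-conditioned" toy loss: a simple parabola with
    # its minimum at w = 3. No cliffs, no flat plateaus.
    return (w - 3.0) ** 2

def numerical_gradient(f, w, eps=1e-6):
    # Differentiability in practice: nudge the parameter both ways and
    # measure the change in loss. This only works if f is smooth near w.
    return (f(w + eps) - f(w - eps)) / (2 * eps)

# One step of gradient descent on the toy loss.
w = 0.0
g = numerical_gradient(loss, w)   # negative: the loss slopes downhill toward w = 3
w = w - 0.1 * g                   # step in the downhill direction
```

If the gradient dropped to zero (a plateau) or blew up to infinity (a cliff), this step would either stall or overshoot, which is exactly the failure mode the "well-conditioned" requirement guards against.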



Today was talk day in Las Vegas. I will have a lot more details to write about the talk later, but giving this (non-recorded) talk was a super-fun experience both for me, and my co-presenter Alon Bochman. We had a great conversation afterwards, and among the many things we talked about was the importance of Beer Mode to me. I was reminiscing about the early days of the Ruby community that I was so fortunate to have been a part of and how the serendipitous conversations I had at those early RubyConfs forever changed my life.

Speaking of serendipity, Hacker News is one of my best sources for random Beer Mode input. I saw Rockstar: a language for programs that are also hair metal power ballads tonight on HN and in the comments (unlike YouTube, always read the comments on HN), I discovered this incredible talk by Dylan Beattie, the guy who spec'd, implemented, and performed a program written in Rockstar.

This instantly became one of my favorite talks of all time. It looks at programming as an art form, with examples ranging from Conway's Game of Life to the mathematics behind Mandelbrot plots, through the esoteric world of programming quines (programs that print their own source as output) and polyquines, which are beyond incredible, and finally his performance of a program written in Rockstar.

Watching this reminded me of Giles Bowkett's incredible Archaeopteryx talk that I was fortunate enough to have watched live at RubyConf Salt Lake City almost a decade ago. He was featured recently in another book that I love called So Good They Can't Ignore You by Cal Newport. This is a masterful talk that was also a piece of performance art.

This sent me down memory lane thinking about incredible talks that I have seen in the past. This talk by Guy Steele Jr. was and remains at the top of my list of most incredible performance-art talks that I have seen. This also relates somewhat to Ruby, as I attended my one and only OOPSLA in 2005, just before my first RubyConf, as part of a "learning vacation" that I took back then. So good - it starts off super weird but I promise you it's worth sticking through to the end.


I finished implementing the zola serve preview in my vscode-zola VS Code extension. It contains everything that I know (which isn't very much) about writing extensions, JavaScript and TypeScript. I also learned how to package my extension into a VSIX using the vsce tool so now I can write using VS Code and not have to first build and run my extension from another VS Code! If I find that I like how I use it over the next few days, I'll publish it to the VS Code Marketplace and post a link to the zola community. #


I'm getting ready to fly to Las Vegas for my first talk in ages: Train a Kaggle Winner Under $50 on Azure. One thing that I'm planning to share with my audience is this amazing Derek Sivers TED Talk on how to start a movement; it could just as easily be how to start an Open Source project as well!

At my talk, I'm going to be releasing a project that I've been working on for a while: ez that hopefully makes using Azure as a powerful personal computer much easier than it is today.



Beer mode is extremely important to me: it's how I discover new ideas serendipitously. Twitter is a great place for beer mode activities, but it's also a toxic cesspool full of shitposting and virtue signaling. It's entertaining, exhausting, and distracting.

Fortunately, there are a small number of users who consistently post great content focused on an area that I'm interested in. They are, essentially, "professional tweeters". Or, if you prefer terms from a bygone era, they are micro-bloggers.

tweetdeck is the only Twitter client you should use, as it lets you control exactly how you want to view tweets, and you can remove the algorithmic timeline (and ads) from it entirely. At first, I configured tweetdeck to follow individual users, which worked great and cut down the noise considerably.

But I still saw replies from those users. While that isn't necessarily a bad thing, it's really random when you see it in a column, as the context isn't there. I really prefer starting from the top of a thread and then seeing any replies that they may be making to other users on their thread. The trick that I learned to do this is to create search columns instead of user columns. You just need to use this format for the advanced search syntax:

from:user_name -filter:replies

That's it! Here's a screenshot of how this looks in tweetdeck #


Still thinking about how to optimize this blog by capturing screenshots of things like tweets, Instagram posts, or YouTube thumbnails. While it is certainly possible to do all of this manually with existing OS screenshotting tools, it would certainly be more efficient to remove the friction from the process by simply working with links.

An idea that I had this morning was to use a headless browser and some automation to capture tweets, YouTube videos, etc. Ideally this is something that can be extended by other motivated contributors.
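A minimal sketch of the idea, using Chrome's documented headless screenshot mode (the binary name, URL, output path, and window size below are all assumptions; adjust for your OS, e.g., "chromium" on Linux or the full app path on macOS):

```python
import subprocess

def build_screenshot_cmd(url, out_path, width=600, height=800):
    # Assemble the argument list for Chrome's headless screenshot mode.
    # The binary name "google-chrome" is an assumption for this sketch.
    return [
        "google-chrome",
        "--headless",
        "--disable-gpu",
        f"--window-size={width},{height}",
        f"--screenshot={out_path}",
        url,
    ]

cmd = build_screenshot_cmd("https://twitter.com/user/status/123", "tweet.png")
# subprocess.run(cmd, check=True)  # uncomment once Chrome is installed
```

From there, a wrapper per service (tweets, YouTube thumbnails, Instagram posts) could crop and link the resulting image, which is where other contributors could plug in.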

There is some prior art in this area, as this post discusses. There is also a Twitter screenshots Chrome extension by the same guy that created screenshot guru. GH Repo #

David Perell continues to deliver value to me. In this tweet, he advocates for studying debate as a way to better frame ideas. I've long thought about this as "not being mired in the details of an idea", but I can see how this is also about jumping to conclusions. For example, focusing on a detail implies that the detail is somehow a foregone conclusion already, instead of exploring and convincing others about the problem. Framing the problem is something that I want to get better at since, as David so concisely describes it, it is "... the highest leverage thing to do".



Today's morning read is a wonderful essay that hit the top of HN titled 100 years of whatever this will be. The key thing that Avery Pennarun talks about is the importance of centralized regulation in distributed systems. He also talks about just how hard it is to build a reliable distributed system, from his perspective as co-founder of Tailscale. And he makes the observation, with lots of examples, that decentralized markets trend toward chaos over time, which is why it's important to centrally regulate them.

The job of regulation is to stop distributed systems from going awry. Because distributed systems always go awry.

He references another essay The Tyranny of Structurelessness that says "if you don't have an explicit hierarchy you have an implicit one" and then brilliantly applies it to look at why your distributed system can fail.

The rant about technology at the start is awesome as well, an example:

IT security has become literally impossible: if you install all the patches, you get SolarWinds-style supply chain malware delivered to you automatically. If you don't install the patches, well, that's worse. Either way, enjoy your ransomware.

Worth noting that this is a thinly veiled takedown of the DeFi and crypto movements in general; he accuses them of making hand-wavy arguments about the reliability of those systems ("it's decentralized so it can't fail!").

Go read it. I'll wait. #

I'm really happy with rebooting this blog using Cloudflare Pages. In an attempt to build a more interesting site with interactive content (e.g., embedded YouTube videos, Twitter and LinkedIn blocks, etc.) I've found that the size (and bloated-ness) of the site has increased dramatically. It's 100% due to how I'm using JavaScript to embed those external services. This is a report that I just ran this morning:

I think that in the future, I need to teach vscode-zola how to embed an image of, say, the YouTube video, the Twitter post, or the Instagram post with a link to the original instead of doing these crazy embeddings. The fact that my web page is 6.4MB is inconceivable. #