
John Lam's Blog


Optimizing Docker image size🔗

There's a nice blog post on optimizing Docker image size. I learned about tools that I hadn't known about before, especially dive. It lets you troubleshoot your Docker images and even tells you when you have files duplicated across different layers in your image. It's a great way to discover waste in your images.

Animated gif showing dive

The HN thread also has a bunch of useful tips for managing layers:

You can use docker build --squash to collapse the intermediate layers into a single layer

The Docker BuildKit tool supports heredocs, so several commands can go into a single RUN instruction (and a single layer):

RUN <<EOF
apt-get update
apt-get upgrade -y
apt-get install -y ...
EOF

There's a tool called stargz (seekable tar.gz format) that greatly improves startup time for container images by lazy-loading files: see the stargz snapshotter

Look at how it improves startup time for the python-3.7 image:

This FOSDEM 2021 presentation is a good overview of stargz.

This is built on top of the Google CRFS (Container Registry Filesystem) project which is a FUSE filesystem that lets you mount a container image served directly from a container registry without pulling it locally first. The linked repo has a fantastic README that explains how it all works.


On the importance of empathy🔗

In case we need to be reminded of it: not everyone we disagree with is motivated by something evil. In the words of Ted Lasso (not actually Walt Whitman):

You know Rupert, guys have underestimated me my entire life. And for years I never understood why. It used to really bother me. But then one day I was driving my little boy to school and I saw this quote from Walt Whitman painted on the wall there that said “Be curious. Not Judgmental.” I like that.

So I get back in my car and I’m driving to work and all of sudden it hits me. All them fellas who used to belittle me, not a single one of them was curious. You know, they thought they had everything figured out. So they judged everything. And they judged everyone. And I realized that their underestimating me, who I was had nothing to do with it. Because if they were curious, they would have asked questions. You know. Questions like, have you played a lot of darts, Ted? Which I would have answered. Yes sir. Every Sunday afternoon at a sports bar with my father from age 10 to 16 when he passed away.

Barbecue sauce.

This quote is from this excellent scene from Ted Lasso:

Sadly, there's been a lot of non-empathetic judgemental behavior during the pandemic. A lot of it from me, I'm ashamed to admit.

I've really enjoyed watching the growth of Lex Fridman ever since I discovered him a few years ago. Recently, he's begun to talk more about the importance of empathy, of being open, and of really listening to other people without judgment.

A relevant quote from the opening:

Those who advocate for lockdowns as a policy often ignore the quiet suffering of millions that it results in which includes economic pain, loss of jobs that give meaning and pride in the face of uncertainty ... Many folks whose job is unaffected by the lockdowns talk down to the masses about which path forward is right and which is wrong. What troubles me most is this very lack of empathy among the policymakers for the common man and in general for people who are unlike themselves.

I had a really hard time with the realization that comes from listening to this and subsequently looking at myself in the mirror. Divisiveness is not the answer to any of the problems that we have. Nor should public campaigns be waged to silence the heretics. This is a good reminder to me and I will strive to do better.


The AsmTutor site is a great introduction to NASM. I especially like how it begins with making Linux syscalls.

The real sequel to On Intelligence🔗

About 20 years ago, I read Jeff Hawkins' excellent book On Intelligence. One of the things that really struck me back then was his description of how we could map different sensors to the neocortex, effectively creating a cyborg. I had always thought that this meant a physical fusion of human and machine. But then I read this quote and realized ... that we are already cyborgs.

The thing that people, I think, don’t appreciate right now is that they are already a cyborg. You’re already a different creature than you would have been twenty years ago, or even ten years ago. You’re already a different creature. You can see this when they do surveys of like, “how long do you want to be away from your phone?” and—particularly if you’re a teenager or in your 20s—even a day hurts. If you leave your phone behind, it’s like missing limb syndrome. I think people—they’re already kind of merged with their phone and their laptop and their applications and everything.

This quote is from the excellent Neuralink and the Brain's Magical Future feature from Tim Urban, which I now regard as the real successor to On Intelligence (don't bother with Jeff Hawkins' sequel, A Thousand Brains). This 38,000-word epic builds a compelling case for why we need to create a brain-machine interface (BMI), starting with a very entertaining and interesting introduction to neuroscience.

There's a very large hand-wave in the middle of the article, where he transitions from a discussion of how we got here to where we could go. Building a BMI is a very hard problem, and he does a great job of explaining through analogy exactly why it is so hard, and of motivating why we need to tackle it.

While some may read the article as a lot of Elon fanboyism, I think it does a great job of capturing what motivates Elon. It took me several hours spread over a couple of days to read the entire thing, but it's well worth the investment in time.


How can we build better computational assistants?🔗

There is a lot of ceremony in note-taking systems. Some examples:

A lot of the ceremony is due to the need to maintain the system itself. Yes, there are tools like Obsidian and Roam Research that help maintain these systems, but they largely just make maintaining the system more efficient.

Perhaps an interesting thought experiment (I don't have solutions here; I'm just trying to write down the right questions to ask) would be to imagine what a Star Trek-like conversation with a super-intelligent machine could look like.

ME: Hey Jarvis, teach me about super intelligence

JARVIS: How does it feel to be standing here?

Human Progress over Time

JARVIS: {more smart things}

That drawing is from the beginning of Tim Urban's excellent 95,000-word magnum opus of a four-part blog post. I've been obsessed (over the past hour or so) with Tim's storytelling virtuosity. What Tim seems to do extremely well is take very complex systems and boil them down into simple, accessible, yet entertaining stories. As I think about what a super-intelligent note-taking system could be: could it take our boring notes about something and synthesize them into something incredibly interesting and entertaining, like what Tim is capable of creating? Could we have our own "Tim in a Box"?

I must say that this is just one possible take, but I'm pretty sure that whatever our future hyper-intelligent assistants will be, they won't look like a VS Code interface.

Incidentally, this idea came about when I was watching Tim's Google Talk this morning:


This is a beautiful picture of what a myelinated axon looks like:


I miss Euan Garden.

Euan Garden

Euan was a friend, a colleague, and a former boss. He's one of the first people who pop into my mind when I think of the phrase "a man of character". I can remember so many stories from our many conversations over the years, starting with the first time I met him, when he was interviewing to become my boss in the Technical Computing Group at Microsoft over a decade ago. It was a normal interview until the end, when he started fanboying over some work that I had done earlier with Delphi and COM. Anyone who has met Euan can imagine the AMOUNT OF ENTHUSIASM with which said fanboying was done. But he did it not in a way that made him seem like he was angling for the job (despite the circumstances), but in the true spirit of love for the technology. When Euan was my boss on that amazing team so long ago, he was a great boss: one who gave me room to explore crazy ideas and had the wisdom to step in when those ideas would go out of control :)

At some point, it became clear to all of us that Technical Computing was not going to be a thing at Microsoft. When I was recruited into the Windows team (hi, Mahesh!) coincidentally to own COM and to build the new WinRT programming model on top of it, I had to break the news to Euan over the holidays. I needed to do it in person, so I remember driving over to his old house (he wound up moving across the street to his new house believe it or not) to break the news. As you might expect, Euan was gracious and helpful during the transition as that was going to become my first experience with being a manager at Microsoft.

We were never too far away from each other over the years. I remember random conversations with him that included:

  • How to talk to the police if you happen to have shot an intruder in your house
  • Why driving to work on a scooter while wearing shorts was the only way to travel in December in Redmond
  • Why having an Xbox in every room in your house as a media extender was the only way to live
  • Why Thinkpads are the ONE TRUE LAPTOP

We reunited on the same (larger) team when I joined the Python team after my stint in Windows. It was great having him just upstairs from me, where I could drop in to get his sage advice on many things as I sought to be helpful to my new team. His work then was deeply influential on some things that I'm working on now. When we ship, it will be in no small part due to our many chats over the years. I'm proud to be able to carry forward his work, and I'll continue to think of him often as we continue his journey.

Godspeed, Euan.

PS: Others have shared their memories of Euan.

More on the metaverse🔗

So I was procrastinating today after discovering Tim Urban on Twitter (how is it possible that I did not know he existed?):

I went down the rabbit hole and discovered his TED Talk:

And his incredibly funny writeup of what it was like to write his TED Talk.

I was watching it on my phone while sitting on my couch next to my dog, who loves to sleep on my couch. That got me thinking about what this experience would be like if I were watching it in VR. It struck me then, in the middle of watching a talk on procrastination, that at the limits of technology, it wouldn't look any different at all.

That's a problem. It also got me thinking about how early movies looked no different from watching a play at a theater ... editing hadn't been invented yet! So they simply recorded what it was like to watch a play, which made it possible for people who were not in the room when the play was showing to experience it, albeit in a somewhat degraded form.

My experience in Horizon Workrooms yesterday felt like those early movies: we are literally putting people into a 3D room to experience what it would be like to be in the same place physically, just as those early films made it possible to experience a play without being in the same space. But why should we be constrained by the limitations of reality? What is the equivalent of editing in the metaverse? I don't have any answers here, but at least I have figured out a question, and that's a start.

(An aside) I viscerally remember this thought popping into my head while I was watching Tim's talk. I think this is another great example of Beer Mode thinking, which I believe is extremely important for creative professionals. Random inputs can stimulate thinking in different and unexpected ways, especially with ideas that are tumbling around in your subconscious. Or at least that's how I'm justifying procrastination.

Spatial context and VR🔗

Here's a sequence of photos that I took today at the Microsoft Library:

Photos: Bookshelves · Bookshelf · Shelf · Book · Summary · Table of Contents Detail · Table of Contents Page · Page Detail

They represent zooming in on a piece of information. I find that my memory works this way: I can remember where something is located spatially, often even in cases where I cannot remember the information itself.

Today, VR-based solutions like Horizon Workrooms create meeting spaces that mimic existing meeting rooms. Since I haven't had meetings inside a Workroom yet, I can't comment on the experience first-hand. But from watching some YouTube videos, it seems they're striving to create an experience that mimics a real-world meeting room, which is a great start and requires a tremendous amount of engineering effort to get right.

But let's imagine that we're not limited by what we can do in a physical meeting room. What if we could take advantage of our spatial memory and leave objects around the room? What if those objects could be zoomed into "infinitely" to reveal more about what they represent? What if we could version control the state of the objects in the room?

One of the great things about being messy, like these famous desks, is that the mess represents saved context in a physical space. I'd like to imagine that those piles represent trains of thought and can help bring them back into context (or at least remind their owners of the context!)

Photos: Albert Einstein's desk · Tony Hsieh's desk · Steve Jobs' desk

Modern information is mainly accessed in one of two ways: by search or through storing files in a file system. The problem with both approaches is that neither takes advantage of our natural spatial awareness or our ability to organize information into hierarchies (you're already good at this: you can probably tell me exactly where in your house your toothbrush, your keys, or the milk is located).

What's worse is that they also remove any cues you may have that the information exists at all. What would be great is a system where you could arrange information spatially, and that spatial arrangement could be version controlled. This way the context is maintained.

Shared meeting room spaces are reset for each meeting. But what if we could always recreate (or even version control) different meeting contexts? If I met with you last week, when we re-enter the meeting room everything is exactly where we left it on our last visit. This way we can take advantage of our spatial memory to organize information in a virtual space instead of relying on search or other cumbersome window-based ways of retrieving information. We could also integrate this with search ("find my car keys in the house"), which would reveal the item and its location.
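As a thought experiment, a persistent room like this could be modeled as little more than a version-controlled map of objects to positions. This is a hypothetical sketch (the Room class and its methods are invented for illustration, not any real Workrooms API):

```python
class Room:
    def __init__(self):
        self.objects = {}   # object name -> (x, y, z) position
        self.history = []   # saved layouts (versions)

    def place(self, name, position):
        self.objects[name] = position

    def snapshot(self):
        # Save the current layout so the next meeting resumes from it.
        self.history.append(dict(self.objects))

    def restore(self, version=-1):
        # Re-enter the room exactly as we left it on a previous visit.
        self.objects = dict(self.history[version])

    def find(self, name):
        # "Find my car keys in the house": reveal the item's location.
        return self.objects.get(name)

room = Room()
room.place("whiteboard notes", (0, 2, 1))
room.snapshot()                      # end of last week's meeting
room.place("whiteboard notes", (5, 0, 0))
room.restore()                       # re-enter: back to the saved layout
print(room.find("whiteboard notes"))
```

The point of the sketch is that "spatial context" is cheap data; the hard parts are the rendering and the interaction design, not the bookkeeping.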

This way we can transcend what is possible spatially, rather than creating a world that is merely a lower-fidelity imitation of the real one. I get that the primary benefit early adopters get from Workrooms is allowing people who are distant to collaborate in a more meaningful way, much like those early films allowed people not in the room to experience the play. That's already super-valuable, especially in today's world. But perhaps spatial context could be akin to editing in film, creating an entirely new medium in which we can collaborate.


I went to Best Buy this afternoon and bought an Oculus Quest 2 because I wanted to spend some time this holiday seeing how real Ben Thompson's take on the Metaverse was. I'm pretty amazed at what they were able to cram into a $299 device. To quote:

My personal experience with Workrooms didn’t involve any dancing or fitness; it was simply a conversation with the folks that built Workrooms. The sense of presence, though, was tangible. Voices came from the right place, thanks to Workrooms’ spatial audio, and hand gestures and viewing directions really made it feel like the three of us were in the same room. What was particularly compelling was the way that Workrooms’ virtual reality space seamlessly interfaced with the real world:

People joining a meeting without a headset appear on a TV as if they are video conferencing; it feels completely natural. Here is a photo from Facebook’s PR pack (which honestly, given the size of the group, seems less immersive; my demo had one person on video, and one person in the room with me):

I'm writing this back on my PC after spending about 30 minutes in a Horizon Workroom by myself. So here's a really quick take on what I liked and didn't like about the experience:

The immersive feeling is real. I really like how, with Oculus Remote Desktop installed on my PC, I could interact with and type on my PC.

The experience with the whiteboard was ... interesting. I get that this is $299 worth of consumer-level hardware, so I'm not expecting a whole lot. You can write on the whiteboard using one of the controllers flipped around, so that you're using the bottom like a pen. This was ... OK, but clearly not as good as it could be with better hardware. Then again, I'm holding my arm out in space rather than bracing it against a real whiteboard, so I'm not sure how much better hardware would help in this regard.

The latency while using my keyboard and mouse was pretty jarring. I totally respect that I can actually get things done with this, but this is very early-adopter territory. Oculus Remote Desktop is pretty confused by multi-monitor setups (Aero Snap on Windows doesn't do what I would expect), and I need to take off my headset and manually move windows to my primary monitor to get it to work correctly. It is usable, though. I wonder if it's any better with USB tethering to my PC?

The resolution is limiting. I get that this will get better over time, but it likely needs to be a LOT better. I'm looking at two 27" 4K monitors right now, and that's my ideal experience. I would imagine 25MP per eye would get us pretty close to the experience I already have, but that's 7x more pixels than I'm seeing right now at 1832x1920 per eye (which is still quite amazing for a $299 device!)

Even though I wouldn't call the Quest heavy, it does have a noticeable heft on my head. I wonder how well this will hold up during a meeting (I have one scheduled later this week with a friend to find out).

I'd really like the device to do a better job of tracking where my hands are and letting me see them as I type on my keyboard. I hit keys like F10/F11 all the time, and I can't really touch-type those keys, as they're quite a reach from the home position (try it yourself to see what I mean). I'd like to see my mouse as well. I imagine this will get much better in the future.

But back to the latency: this is probably the biggest technical issue I see with the hardware right now. Still, there's promise here. It feels very Christensen-disruptive (if the collaborative room experience is as good as Ben claims).

I'll try it again later tonight. This does feel like the future, but we're definitely not there yet.


This is a fantastic piece from NYT Opinion that asks a simple question: in the 18 US states where Democrats have absolute power, do they live the values espoused by their party? It examines three key issues: affordable housing (California), progressive taxation (Washington), and education (Illinois). The results may surprise you. I didn't know that Washington has the most regressive taxation of any state in the nation.

We continue to focus on case counts now that Omicron is raging across the country. But is this the right metric to be looking at? We're saying "OMG, Omicron is setting new daily records." But given that Omicron is much less virulent:

and the recent data from South Africa shows a dramatic decoupling in deaths and cases compared to previous variants:

Why are we continuing to scare people with hyperbolic language like "global dominance"?

It's meaningless to compare case numbers from a more virulent but less transmissible variant to a less virulent but more transmissible one, unless you're just trying to scare people. Perhaps it's time for a better metric: hospitalizations seem like a much more reasonable one to a first approximation. Here is Bob Wachter's more reasonable look at hospitalizations (which, of course, lag cases):

Also, Bob has a hopeful take on Omicron. Hopefully Omicron continues to outcompete Delta and becomes the endemic variant in the population. Maybe, as Bob suggests, COVID becomes "just like a bad flu" by the spring. We can only hope.


This is the best explainer of how airline frequent flyer programs really work. Airlines are effectively sovereign currency issuers, with the caveat that they control not only the issuing of the currency but also the only means by which it can be redeemed. This is not entirely true (you can purchase goods and services by redeeming frequent flyer miles), but you'd have to be crazy to do so given the terrible exchange rates offered by the credit card issuers.

This diagram does a great job of explaining how Wall Street values airlines and their frequent flyer programs. Effectively, operating an airline is a loss leader for the frequent flyer program! There's an arbitrary multiplier applied over EBITDA to come up with the valuation of the frequent flyer programs, but this is still a stunning figure:

The video also does a great job of explaining how different forms of arbitrage have been systematically eliminated by the airlines. For example, mileage arbitrage (finding a cheap flight on a circuitous route that would yield a large number of miles for a low cost) has been eliminated by pegging reward miles to dollars spent rather than miles flown. They do the same thing on the redemption side as well. I used to run an arbitrage on this many years ago, when I could fly anywhere at any time for 25K miles. I would charge customers for the flight at a discount to the current fare: if a flight cost $1200, I would sell it for $1000 and redeem 25K miles, for an unheard-of $0.04 per mile. Yes, I would get taxed on the income from that flight, but that was still an unheard-of redemption rate for miles.
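The arithmetic behind that redemption rate, using the numbers from the example above:

```python
flight_price = 1200      # current fare for the flight, in dollars
sale_price = 1000        # what I charged the customer
miles_redeemed = 25_000  # miles spent to book the ticket

# Dollars of value extracted per mile redeemed
rate = sale_price / miles_redeemed
print(f"${rate:.2f} per mile")  # $0.04 per mile
```

Pegging earn and burn to dollars spent kills exactly this kind of spread.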

One of the challenges of building a semantic search engine is splitting the input text into smaller chunks suitable for generating embeddings with Transformer models. This thread on the Hugging Face forums does a good job of breaking the problem down into smaller pieces. The key insight from lewfun is using a sliding window algorithm over the text in the document. For those who don't know, there is a limit on the number of tokens (roughly, words) that can be fed to a model. With a sliding window, you split the entire document into overlapping blocks and map each block back to the same original document. Then you can use similarity-based models to generate a ranked list that maps back to the original documents. This is how I will build the first version of my semantic search engine.
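A minimal sketch of the sliding-window idea (window and stride sizes here are illustrative; the real limit comes from the model's tokenizer, often 512 tokens):

```python
def chunk(words, window=512, stride=256):
    """Split a list of words into overlapping blocks; each block maps
    back to the same source document."""
    blocks = []
    start = 0
    while start < len(words):
        blocks.append(words[start:start + window])
        if start + window >= len(words):
            break
        start += stride
    return blocks

doc = "the quick brown fox jumps over the lazy dog".split()
blocks = chunk(doc, window=4, stride=2)
# Each overlapping block is embedded separately; a similarity hit in
# any block ranks the whole document it came from.
print(blocks)
```

The overlap (stride < window) is what keeps a sentence that straddles a block boundary from being invisible to the model.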


I've been interested in using machine learning to extract text reliably from a web page for archival purposes (adding to my personal knowledge base). So today, I'm collecting some links to prior art in this area for inspiration:

A Machine Learning Approach to Webpage Content Extraction. This paper uses support vector machines to train a model on some specific features of each text block:

  • number of words in this block and the quotient to its previous block
  • average sentence length in this block and the quotient to its previous block
  • text density in this block and the quotient to its previous block
  • link density in this block
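As a rough illustration of what features like these might look like in code (this is my sketch, not the paper's implementation; the block_width normalization and the simple regex tokenization are assumptions):

```python
import re

def block_features(block_text, prev_text, block_width=80):
    """Sketch of per-block features similar to those listed above.
    block_width (characters per rendered line) is an assumed constant."""
    def words(t):
        return re.findall(r"\w+", t)

    def sentences(t):
        return [s for s in re.split(r"[.!?]+", t) if s.strip()]

    n_words = len(words(block_text))
    prev_words = len(words(prev_text)) or 1  # avoid dividing by zero

    sents = sentences(block_text)
    avg_sentence_len = (n_words / len(sents)) if sents else 0.0

    # Text density: words per (assumed-width) line of the block.
    lines = max(len(block_text) // block_width, 1)
    text_density = n_words / lines

    # Link density would additionally need the block's HTML
    # (words inside <a> tags divided by total words).
    return {
        "num_words": n_words,
        "word_quotient_prev": n_words / prev_words,
        "avg_sentence_len": avg_sentence_len,
        "text_density": text_density,
    }

feats = block_features("One two three. Four five.", "one two")
print(feats)
```

The appeal of features like these is that they need no rendering: everything comes from the raw text and markup of adjacent blocks.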

Readability.js. This is a Node library that contains the readability code used for Firefox's Reader View. In a web browser, you can pass the document object from the browser DOM directly to the library. In a Node application, you'll need to use an external DOM library like jsdom. Either way, the code is simple:

var article = new Readability(document).parse();

or in the case of a node app:

var { Readability } = require('@mozilla/readability');
var { JSDOM } = require('jsdom');
var doc = new JSDOM("<body>Look at this cat: <img src='./cat.jpg'></body>", {
  url: ""
});
let reader = new Readability(doc.window.document);
let article = reader.parse();

Ideally, I should create an API that accepts a URI as a parameter and returns the parsed document to the caller. Invoking this from a Chrome extension would make it very straightforward to "clip" a web page into a personal knowledge base.

Large language models like GPT-3 show real promise in this area as well. In this article, the authors use GPT-3 to answer questions based on text extracted from a document. Starting at 2:47, the video below has a great demo of this working.


Day 3 of fastai. The book arrived today and I'm following along from it directly. Reading the paper book while using the Jupyter notebooks from the book's GitHub repo is a potent combination. Reminder: you can run the book repo in a single step using ez and VS Code.

The Universal Approximation Theorem undergirds the theoretical basis for neural networks being able to compute arbitrary functions. Its two parts look at the limits of a single layer with an arbitrary number of neurons ("arbitrary width") and the limits of a network with an arbitrary number of hidden layers ("arbitrary depth").
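A tiny concrete instance of the "arbitrary width" idea: a single hidden layer of just two ReLU neurons computes f(x) = |x| exactly, since |x| = relu(x) + relu(-x). Wider layers of such piecewise-linear units are what let one hidden layer approximate any continuous function on an interval:

```python
def relu(z):
    return max(z, 0.0)

def one_layer_net(x):
    # Hidden layer: two neurons with weights +1 and -1, biases 0.
    hidden = [relu(1.0 * x), relu(-1.0 * x)]
    # Output layer: sum the hidden activations with weights 1 and 1.
    return 1.0 * hidden[0] + 1.0 * hidden[1]

for x in (-3.0, -0.5, 0.0, 2.0):
    print(x, one_layer_net(x))  # matches abs(x)
```

Adding more hidden neurons adds more "kinks" to the piecewise-linear output, which is the intuition behind the width form of the theorem.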