
John Lam's Blog

2022-01-17

The Charisma Machine

The Charisma Machine by Morgan Ames chronicles the failure of the One Laptop Per Child project. A telling quote from the book's website:

Drawing on a seven-month study of a model OLPC project in Paraguay, this book reveals that the laptops were not only frustrating to use, easy to break, and hard to repair, they were designed for “technically precocious boys”—idealized younger versions of the developers themselves—rather than the diverse range of children who might actually use them.

It's pretty clear that the leaders of the project fell in love with an idea: one where the solution they were creating matched a problem the leaders personally had as children - a desire for a machine like the OLPC. They assumed, incorrectly, that all children in developing countries had the same problem that their solution was designed for. Ames describes this as nostalgic design.

So instead of validating the problem, they went ahead and burned hundreds of millions of dollars in pursuit of a solution to a problem that apparently did not exist. Technology is not the solution to all of our problems, especially in education. But perhaps more interesting for those of us who work in product development is understanding why Negroponte was wrong. What could he have done differently in retrospect? Could the many failures encountered in the field have been discovered sooner? This quote captures Negroponte's hubris:

The days of pilot projects are over. When people say, ‘Well, we’d like to do three or four thousand in our country to see how it works.’ Screw you. Go to the back of the line and someone else will do it, and then when you figure out that this works, you can join as well.

Developing and deploying the solution as quickly as possible was clearly the end-goal here, not understanding the needs of the customer:

We’ll take tablets and drop them out of helicopters into villages that have no electricity and school, then go back a year later and see if the kids can read.

Could they have shipped a better product if they had only spent some time trying to understand their customers first? One can only imagine what might have been had they pursued a different path, guided by the real needs of the children they were building for. It's a fantastic case study of what not to do when building a product.

2022-01-13

Multi-Agent Hide and Seek

This is a great visual demonstration of the progress of learning in self-supervised learning via reinforcement learning. Adding human-like behaviors to the simulation makes it far more engaging to me!

The accompanying OpenAI blog post does a great job of annotating the different skills learned along the way. The reward function is simple:

Agents are given a team-based reward; hiders are given a reward of +1 if all hiders are hidden and -1 if any hider is seen by a seeker. Seekers are given the opposite reward, -1 if all hiders are hidden and +1 otherwise.
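As a concrete reference, here's a minimal sketch of that team-based reward in Python. This is my own paraphrase of the description above, not OpenAI's actual code, and the visibility check is a stand-in for the environment's real line-of-sight test:

def team_rewards(hider_is_seen):
    """Team-based reward as described in the blog post.

    hider_is_seen is a list of booleans, one per hider, True if any seeker
    can currently see that hider (a stand-in for line-of-sight cones).
    """
    any_hider_seen = any(hider_is_seen)
    hider_reward = -1.0 if any_hider_seen else 1.0  # +1 only when ALL hiders are hidden
    seeker_reward = -hider_reward                   # seekers get the opposite reward
    return hider_reward, seeker_reward

# Example: two hiders, one of them spotted -> hiders are penalized this step
print(team_rewards([False, True]))  # (-1.0, 1.0)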

From this simple reward function, many strategies emerge, along with surprising "hacks" where the agents exploit loopholes in the simulation. The section on Surprising Behaviors shows these clearly.

But given the extremely large number of episodes that led to these behaviors, I wonder how the researchers were able to spot them in the first place.

UPDATE: I just discovered the Two Minute Papers channel on YouTube and they do a great job of summarizing the paper in this video. Subscribed.

2022-01-10

The origins of yak shaving

OK, so I find it amusing that, in the process of hunting down a reference for the algorithm that Donald Knuth uses to reward people who find bugs in TAOCP, I found the earliest documented mention of the term yak shaving, coined by Carlin Vieri:

From: "Jeremy H. Brown"
To: [email protected]
Subject: GSB: 5:30pm, 7ai playroom
Date: Fri, 11 Feb 2000 16:00:56 -0500 (EST)

"Yak shaving."  Our very own Carlin Vieri invented the term, and yet
it has not caught on within the lab.  This is a shame, because it
describes all too well what I find myself doing all too often.


You see, yak shaving is what you are doing when you're doing some
stupid, fiddly little task that bears no obvious relationship to what
you're supposed to be working on, but yet a chain of twelve causal
relations links what you're doing to the original meta-task.


Here's an example:


"I was working on my thesis and realized I needed a reference.  I'd
seen a post on comp.arch recently that cited a paper, so I fired up
gnus.  While I was searching the for the post, I came across another
post whose MIME encoding screwed up my ancient version of gnus, so I
stopped and downloaded the latest version of gnus.  


"Unfortunately, the new version of gnus didn't work with emacs 18, so
I downloaded and built emacs 20.  Of course, then I had to install
updated versions of a half-dozen other packages to keep other users
from hurting me.  When I finally tried to use the new gnus, it kept
crapping out on my old configuration.  And that's why I'm deep in the
gnus info pages and my .emacs file -- and yet it's all part of working
on my thesis."


And that, my friends, is yak shaving.  (Not that this particular
example happened to me recently or anything.)  


Another example of yak shaving might be "I need to ask  a
question about our group project.  <Person>'s hard to track down.  But
I know <person> will be at GSB tonight, so I'll go there too."  You
see, in this case GSB isn't socializing or slacking off, it's part of
your work -- hanging out there, you'll be connected to your project
via the process of yak shaving!


Now, having been introduced to the term and the concept of yak
shaving, and also having had it pointed out to you that attending GSB
is actually an important part of your research, you are morally bound
to come to GSB tonight and discuss the finer points of yak shaving,
and in particular the subtle distinctions between some forms of "yak
shaving" and "flimsy rationalizations".


****************** G I R L  S C O U T  B E N E F I T ******************
******************        5:30pm  7AI Playroom       ******************

2022-01-09

Donald Knuth as the patron saint of yak shaving

Once upon a time, there was Donald Knuth. He's a well-known professor, computer scientist, and the author of TAOCP, his life's work. Every day he worked on TAOCP as a digital recluse. While progress sometimes slowed (three volumes were completed by 1999), he does have a schedule for completing it.

His work did not go unnoticed. In 1999, American Scientist named TAOCP one of the twelve best physical-science books of the century, alongside other notable works such as Dirac on quantum mechanics, Einstein on relativity, and Mandelbrot on fractals. And so he kept working, until one day in 2015 someone noticed that there were lots of other things Knuth had created along the way in support of his life's work.

The activities that led to the creation of these other things are sometimes called yak shaving:

Yak shaving refers to a task that leads you to perform another related task, and so on, and so on — all distracting you from your original goal. This is sometimes called "going down the rabbit hole."

The first time I heard this term was at a RubyConf many years ago. Not surprisingly, Knuth's behavior was noticed by a member of the Ruby community: Florian Gilcher, who penned this wonderful post on Knuth's yak shaving activities. Knuth's most notable accomplishment outside of TAOCP is TeX, a typesetting language that he invented to typeset TAOCP. So if you consider TeX the original yak shave, it led to all the other yak shaves that are lovingly chronicled in Florian's post:

  • Write your own programming language to implement TeX in (WEB)
  • Invent a new programming paradigm (literate programming - a predecessor to today's RMarkdown and Jupyter) for WEB
  • Invent a new algorithm for text layout
  • Design a font (Computer Modern)
  • Invent a new authoring tool for fonts (METAFONT)
  • Implement a new device-independent layout engine (DVI)

The HN post also had this wonderful comment from David Fuchs, who told a story about even more yak shaving by Knuth(!):

This and more. The Sail version of TeX is mentioned, but everything else had a usable prototype, too: An early version of Metafont in Sail, an early version of Web (called Doc, though I don’t recall what the names of the Tangle and Weave equivalents were), and an early version of Computer Modern fonts in the early Metafont language. Though fully documented and available via anonymous FTP on the Arpanet, all of these prototypes were experimental proofs-of-concept and were completely discarded, along with the languages they implemented, with the “real” versions rewritten from scratch by Knuth (each first by hand on legal pads, then typed in and debugged as a whole.)

And you missed one more obscure but instructive example: To get the camera-ready copy for ACP Vol 2, Knuth purchased an Alphatype CRS phototypesetter. And, unhappy with the manufacturer’s firmware, he rewrote the 8080 code that drives the thing. Eight simultaneous levels of interrupts coming from various subsystem: the horizontal and vertical step-motors that you had to accelerate just right, while keeping synchronized with four S100 boards that generated slices of each character in realtime from a proprietary outline format, to be flashed into the lens just as it was passing the right spot on the photo paper. (Four identical boards, since you had to fill the second while the first one did its processing; times two for two characters that might overlap due to kerning). Oh, and you had to handle memory management of the font character data since there wasn’t enough RAM for even one job’s worth (see our joint paper).

Fun times. I did the driver on the mainframe side, so got to be there for the intense debugging sessions that used only the CRS’s 4x4 hex keypad and 12-character dot-matrix display.

Thanks for the DVI shout-out, btw.

I was fortunate enough to meet Knuth at useR 2016 when it was held on the campus of Stanford University. At the end of his session on literate programming, he rode away on a bicycle carrying an enormous stack of books precariously perched on one hand. Keep in mind he was 77 years old at the time! Sadly the only photo I have of Knuth from his talk was this one where, apparently, even the great Donald Knuth needs a bit of help to project from his computer!

That trip marked the first time that I rode a mountain bike. I couldn't have picked a better place to do it than the amazing Arastradero Preserve:

2022-01-07

GPT-3 and long form question answer by OpenAI

Note to self: today Dickie Bush reminded me of the Pixar storytelling framework. I'm going to try rewriting this post that way to see if it does a better job of telling the story.

Today's paper is WebGPT: Browser-assisted question-answering with human feedback. When I think about how we can build a bicycle for the mind, I keep wondering how machine learning can help, and I mused about as much while wondering whether it's possible to create a Tim (Urban)-In-A-Box. More important than summarizing or finding information is getting you to actually read that summary.

There are a number of things that I learned from this paper. The first is that there is a high-quality dataset, ELI5, sourced from the popular subreddit, containing questions and answers "explained like I am 5 years old".

I learned about crows

Pixar version

Once upon a time, the Internet was a place where I would do research manually, often on the toilet. Every day I would click on links, often on social media, and read post after post after post, slowly building up a mental model of the things I was researching.

But if I rely on beer mode activities to drive my research, it's going to be pretty random. And since there isn't anything that tries to tie together the things that I have seen on a particular topic, my poor brain is left to fend for itself. So it sees input that looks something like:

  • GPT-3 paper
  • GPT-3 funny meme
  • Picture of cat
  • Shitpost from friend on Twitter
  • OMG COVID will kill us all
  • Spiderman spoiler
  • Introduction to Transformers
  • Transformers: Dark Side of the Moon trailer
  • Attention is all you need paper

But one day I thought: why don't I get my machines to do the heavy beer mode activity of sifting through the Internet and bring me back something that I can read without distractions, preferably written in a Tim Urban style? Just how far-fetched is this idea? Turns out, we're getting closer by the day.

On December 16, OpenAI announced WebGPT, which is a fine-tuned version of GPT-3 that answers open-ended questions by searching for and summarizing what it finds online. Unlike GPT-3, WebGPT augments what is already known in its language model with results obtained by searching the web using the Bing search API.
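To make the shape of the system concrete, here's a minimal sketch of the search-then-summarize loop in Python. The search_web and complete functions are hypothetical stand-ins for the Bing search API and the fine-tuned GPT-3 model (the real system also browses pages, collects quotes, and was trained on human demonstrations), so treat this as an illustration rather than the actual pipeline:

def answer_with_references(question, search_web, complete, max_results=4):
    """Sketch of WebGPT-style answering: search, collect snippets, then ask a
    language model to compose an answer that cites its sources inline."""
    results = search_web(question)[:max_results]  # each result: {"title", "url", "snippet"}

    # Number each snippet so the model can cite it as [1], [2], ...
    quotes = "\n".join(
        f"[{i + 1}] {r['title']} ({r['url']}): {r['snippet']}"
        for i, r in enumerate(results)
    )

    prompt = (
        f"Question: {question}\n\n"
        f"Quotes:\n{quotes}\n\n"
        "Answer the question using only the quotes above, citing them as [1], [2], etc.\n\n"
        "Answer:"
    )
    return complete(prompt)

The interesting part is that the citations fall out of the prompt structure: the model is only allowed to answer from numbered quotes, so every claim can point back to a source.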

The results from the model are pretty interesting. When posed the question: "How do neural networks work?", WebGPT responded with:

In its most basic form, a neural network has two layers: an input layer and an output layer[3]. The output layer is the component of the neural network that makes predictions[3]. In a feedforward network, information flows through the network in the following way: patterns of information are fed into the network via the input units, which trigger the layers of hidden units, and these in turn arrive at the output units[1]. The network learns by a feedback process called backpropagation, which involves comparing the output a network produces with the output it was meant to produce, and using the difference between them to modify the weights of the connections between the units in the network, working from the output units through the hidden units to the input units, going backward[2][4]. Over time, backpropagation causes the network to learn, reducing the difference between actual and intended output to the point where the two exactly coincide, so the network figures things out exactly as it should[2].

This is pretty reasonable! It even generated references:

  1. How neural networks work - A simple introduction (www.explainthatstuff.com)
  2. How neural networks work - A simple introduction (www.explainthatstuff.com)
  3. How Do Neural Networks Really Work? | Nick McCullum (nickmccullum.com)
  4. How Do Neural Networks Really Work? | Nick McCullum (nickmccullum.com)

The linked paper provides a lot of additional details. First, the experimental environment (used by humans recruited via UpWork) is a text-based web browser. Here's a screenshot of what it looks like:

text based browser used by webgpt humans

The text-based browser was designed to be used by human participants, and it does a lot that a conventional browser typically doesn't do in order to make the text easier to work with. It uses a number of existing libraries to transform each page into a format suitable to feed into the model. The browser was also used to collect two additional kinds of data from the study participants: 1) demonstrations of humans using search to answer questions, and 2) comparisons between two model-generated answers to the same question.
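The paper describes its own text-extraction pipeline; as a rough stand-in (my choice of libraries, not necessarily the ones the authors used), here's what reducing a page to model-friendly text might look like with requests and BeautifulSoup:

import requests
from bs4 import BeautifulSoup  # pip install requests beautifulsoup4

def page_to_text(url, max_chars=4000):
    """Fetch a page and reduce it to plain text, roughly like a text-mode browser."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Drop elements that carry no readable content.
    for tag in soup(["script", "style", "nav", "footer"]):
        tag.decompose()

    # Collapse the remaining text into clean, newline-separated lines.
    lines = (line.strip() for line in soup.get_text("\n").splitlines())
    text = "\n".join(line for line in lines if line)
    return text[:max_chars]  # truncate to fit the model's context window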

The model itself was trained to answer questions from the ELI5 dataset that was gathered from questions and answers in the "Explain Like I'm Five" subreddit.

The results from this study are encouraging: the model's answers are preferred to those written by the human demonstrators 56% of the time, and preferred to the highest-voted answer from the ELI5 dataset 69% of the time. This is not human-level performance yet, but it's getting there.

Because of this, I can start to imagine handing the beer mode sifting over to the machines. Because of this, the gap between the answers I want and the answers the machines can produce keeps shrinking.

Until finally the glorious day arrives when I can ask a question like:

Have Tim Urban write an explainer for What is Neuralink and why do we need it?

And I get an answer like this amazing 38,000-word explainer for my question. Or, if I'm in the mood for something more succinct:

Give me a Wikipedia summary for What is Neuralink and why do we need it?

But will it make any difference? Will it actually make people smarter? Think better? Will this insulate us from the endless distraction that is the internet?

Maybe what we need is an assistant that filters the web for us, but in a way that doesn't reinforce our existing biases. I should write about that some other time.

Brian Kernighan's memoir

TIL that UNIX: A History and a Memoir was written by Brian Kernighan and is available on Kindle as an e-book.

2022-01-06

Optimizing Docker image size

There's a nice blog post on optimizing Docker image size. I learned about tools that I hadn't known about before, especially dive. It lets you troubleshoot your Docker images and even tells you when you have files duplicated across different layers in your image. It's a great way to discover waste in your images.

Animated gif showing dive

The HN thread also has a bunch of useful tips for managing layers:

You can pass the (experimental) --squash flag to docker build to squash the newly built layers into a single layer

The Docker BuildKit tool supports heredocs!

# Heredocs require BuildKit (newer Dockerfile syntax; add '# syntax=docker/dockerfile:1' if needed).
# All three commands run in a single RUN instruction, producing one layer instead of three.
RUN <<EOF
apt-get update
apt-get upgrade -y
apt-get install -y ...
EOF

There's a tool called stargz (seekable tar.gz format) that greatly improves startup time for container images by lazy-loading files: stargz snapshotter.

Look at how it improves startup time for the python-3.7 image:

This FOSDEM 2021 presentation is a good overview of stargz.

This is built on top of the Google CRFS (Container Registry Filesystem) project which is a FUSE filesystem that lets you mount a container image served directly from a container registry without pulling it locally first. The linked repo has a fantastic README that explains how it all works.

2022-01-05

On the importance of empathy

In case we need to be reminded of it, not everyone we disagree with is motivated by something evil. In the words of Ted Lasso (not actually Walt Whitman):

You know Rupert, guys have underestimated me my entire life. And for years I never understood why. It used to really bother me. But then one day I was driving my little boy to school and I saw this quote from Walt Whitman painted on the wall there that said “Be curious. Not Judgmental.” I like that.

So I get back in my car and I’m driving to work and all of sudden it hits me. All them fellas who used to belittle me, not a single one of them was curious. You know, they thought they had everything figured out. So they judged everything. And they judged everyone. And I realized that their underestimating me, who I was had nothing to do with it. Because if they were curious, they would have asked questions. You know. Questions like, have you played a lot of darts, Ted? Which I would have answered. Yes sir. Every Sunday afternoon at a sports bar with my father from age 10 to 16 when he passed away.

Barbecue sauce.

This quote is from this excellent scene from Ted Lasso:

Sadly, there's been a lot of non-empathetic, judgmental behavior during the pandemic. A lot of it from me, I'm ashamed to admit.

I've really enjoyed watching the growth of Lex Fridman ever since I discovered him a few years ago. Recently, he's begun to talk more about the importance of empathy, and of being open and really listening to other people without judgment.

A relevant quote from the opening:

Those who advocate for lockdowns as a policy often ignore the quiet suffering of millions that it results in which includes economic pain, loss of jobs that give meaning and pride in the face of uncertainty ... Many folks whose job is unaffected by the lockdowns talk down to the masses about which path forward is right and which is wrong. What troubles me most is this very lack of empathy among the policymakers for the common man and in general for people who are unlike themselves.

I had a really hard time with the realization that came from listening to this and then looking at myself in the mirror. Divisiveness is not the answer to any of the problems we have. Nor should public campaigns be waged to silence the heretics. This is a good reminder to me, and I will strive to do better.

2022-01-03

The AsmTutor site is a great introduction to NASM. I especially like how it begins with making Linux syscalls.

The real sequel to On Intelligence

About 20 years ago, I read Jeff Hawkins' excellent book On Intelligence. One of the things that really struck me back then was his description of how we could map different sensors to the neocortex, effectively creating a cyborg. I had always thought that this meant a physical fusion of human and machine. But then I read this quote and realized ... that we are already cyborgs.

The thing that people, I think, don’t appreciate right now is that they are already a cyborg. You’re already a different creature than you would have been twenty years ago, or even ten years ago. You’re already a different creature. You can see this when they do surveys of like, “how long do you want to be away from your phone?” and—particularly if you’re a teenager or in your 20s—even a day hurts. If you leave your phone behind, it’s like missing limb syndrome. I think people—they’re already kind of merged with their phone and their laptop and their applications and everything.

This quote is from the excellent Neuralink and the Brain's Magical Future feature by Tim Urban, which I now regard as the real successor to On Intelligence (don't bother with Jeff Hawkins' sequel, A Thousand Brains). This 38,000-word epic builds a compelling case for why we need to create a brain-machine interface, starting with a very entertaining and interesting introduction to neuroscience.

There's a very large hand wave in the middle of the article where he transitions from how we got here to where we could go. Building a BMI is a very hard problem, and he does a great job of explaining through analogy exactly why it is so hard, and of motivating why we need to tackle it anyway.

While some may dismiss the article as Elon fanboyism, I think it does a great job of capturing what motivates Elon. It took me several hours, spread over a couple of days, to read the entire thing, but it's well worth the investment in time.

2022-01-02

How can we build better computational assistants?

There is a lot of ceremony in note-taking systems, and a lot of that ceremony comes from the need to maintain the system itself. Yes, there are tools like Obsidian and Roam Research that help, but they are largely tools that make it more efficient to maintain the system.

Perhaps an interesting thought experiment (I don't have solutions here, I'm just trying to write down the right questions to ask) would be to imagine what a Star Trek-like conversation with a super-intelligent machine could look like:

ME: Hey Jarvis, teach me about super intelligence
JARVIS: How does it feel to be standing here?

Human Progress over Time

JARVIS: {more smart things}

That drawing is from the beginning of Tim Urban's excellent 95,000-word magnum opus of a four-part blog post. I've been obsessed (over the past hour or so) with Tim's storytelling virtuosity. What Tim does extremely well is take very complex systems and boil them down into simple, accessible, yet entertaining stories. As I think about what a super-intelligent note-taking system could be, could it take our boring notes about something and synthesize them into something incredibly interesting and entertaining, like what Tim is capable of creating? Could we have our own "Tim in a Box"?

I must say that this is just one possible take, but I'm pretty sure that whatever our future hyper-intelligent assistants turn out to be, they won't look like a VS Code interface.

Incidentally, this idea came about when I was watching Tim's Google Talk this morning:

Myelin

This is a beautiful picture of what a myelinated axon looks like:

2022-01-01

I miss Euan Garden.

Euan Garden

Euan was a friend, a colleague, and a former boss. He's one of the first people who pop into my mind when I think of the phrase "a man of character". I can remember so many stories from our many conversations over the years, starting with the first time I met him, when he was interviewing to become my boss in the Technical Computing Group at Microsoft over a decade ago. It was a normal interview until the end, when he started fanboying over some work I had done earlier with Delphi and COM. Anyone who has met Euan can imagine the AMOUNT OF ENTHUSIASM with which said fanboying was done. But he did it not in a way that made him seem like he was angling for the job (despite the circumstances), but in the true spirit of love for the technology. During the time when Euan was my boss on that amazing team, he was a great boss: one who gave me room to explore crazy ideas, and who had the wisdom to step in when those crazy ideas spun out of control :)

At some point, it became clear to all of us that Technical Computing was not going to be a thing at Microsoft. When I was recruited into the Windows team (hi, Mahesh!) coincidentally to own COM and to build the new WinRT programming model on top of it, I had to break the news to Euan over the holidays. I needed to do it in person, so I remember driving over to his old house (he wound up moving across the street to his new house believe it or not) to break the news. As you might expect, Euan was gracious and helpful during the transition as that was going to become my first experience with being a manager at Microsoft.

We were never too far away from each other over the years. I remember random conversations with him that included:

  • How to talk to the police if you happen to have shot an intruder in your house
  • Why driving to work in a scooter while wearing shorts was the only way to travel in December in Redmond
  • Why having an Xbox in every room in your house as a media extender was the only way to live
  • Why Thinkpads are the ONE TRUE LAPTOP

We reunited on the same (larger) team when I joined the Python team after my stint in Windows. It was great having him just upstairs from me, where I could drop in to get his sage advice on many things as I sought to be helpful to my new team. His work then was deeply influential on some things that I'm working on now. When we ship, it will be in no small part due to our many chats over the years. I'm proud to be able to carry his work forward, and I'll continue to think of him often as we continue his journey.

Godspeed, Euan.

PS: Others have shared their memories of Euan.

More on the metaverse

So I was procrastinating today after discovering Tim Urban on Twitter (how is it possible that I did not know he existed?):

I went down the rabbit hole and discovered his TED Talk:

And his incredibly funny writeup of what it was like to write his TED Talk.

I was watching it on my phone while sitting on my couch, next to my dog who loves to sleep there. That got me thinking about what this experience would be like if I were watching it in VR. It struck me then, in the middle of watching a talk on procrastination, that at the limits of the technology, it wouldn't look any different at all.

That's a problem. It also got me thinking about how early movies looked no different from watching a play at a theater ... editing hadn't been invented yet! So filmmakers simply recorded what it was like to watch a play, which made it possible for people who were not in the room while the play was being performed to experience it, albeit in a somewhat degraded form.

My experience in Horizon Workrooms yesterday felt like those early movies: we are literally putting people into a 3D room to recreate what it would be like to be in the same place physically, just as those early films made it possible to experience a play without being in the same physical space. But why should we be constrained by the limitations of reality? What is the equivalent of editing in the metaverse? I don't have any answers here, but at least I have figured out a question, and that's a start.

(An aside) I viscerally remember this thought popping into my head while I was watching Tim's talk. I think this is another great example of beer mode thinking, which I believe is extremely important for creative professionals. Random inputs can stimulate thinking in different and unexpected ways, especially with ideas that are tumbling around in your subconscious. Or at least that's how I'm justifying procrastination.

Spatial context and VR

Here's a sequence of photos that I took today at the Microsoft Library:

Photos: Bookshelves, Bookshelf, Shelf, Book Summary, Table of Contents Detail, Table of Contents Page, Page, Page Detail

They represent zooming in on a piece of information. I find that my memory works this way: I can remember where something is located spatially, often even in cases where I cannot remember the information itself.

Today, VR-based solutions like Horizon Workrooms create meeting spaces that mimic existing meeting rooms. Since I haven't had meetings inside a Workroom yet, I can't comment on the experience first-hand. But from watching some YouTube videos, it seems like they're striving to create an experience that mimics a real-world meeting room, which is a great start and requires a tremendous amount of engineering effort to get right.

But let's imagine that we're not limited by what we can do in a physical meeting room. What if we could take advantage of our spatial memory and leave objects around the room? What if those objects could be zoomed into "infinitely" to reveal more about what they represent? What if we could version control the state of the objects in the room?

One of the great things about being messy, like these famous desks, is that the mess represents saved context in a physical space. I like to imagine that those piles represent trains of thought and can help bring their owners back into the context (or at least remind them that the context exists!)

Photos: Albert Einstein's desk, Tony Hsieh's desk, Steve Jobs' desk

Modern information is mainly accessed in one of two ways: by search or through files stored in a file system. The problem with both approaches is that neither takes advantage of our natural spatial awareness or our ability to organize information into hierarchies (you're already good at this: you can probably tell me exactly where in your house your toothbrush, your keys, or the milk is located).

What's worse is that they also remove any cues you may have that the information exists at all. What would be great is a system where you could arrange information spatially, and where that spatial arrangement could be version controlled. This way the context is maintained.

Shared meeting room spaces are reset for each meeting. But what if we could always recreate (or even version control) different meeting contexts? If I met with you last week, then when we re-enter the meeting room, everything is exactly where we left it on our last visit. This way we can take advantage of our spatial memory to organize information in a virtual space, instead of relying on search or other cumbersome window-based ways of retrieving information. We could also integrate this with search ("find my car keys in the house"), which would reveal the item and its location.
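As a thought experiment only, here's a tiny Python sketch of what version controlling a room's spatial context might look like: the room is a mapping from objects to positions, a commit snapshots the layout, restore puts everything back where it was, and find is the "find my car keys" search. All of the names here are made up for illustration:

import copy

class Room:
    """A meeting room whose spatial layout can be snapshotted and restored."""

    def __init__(self):
        self.objects = {}   # object name -> (x, y, z) position
        self.history = []   # list of committed layouts

    def place(self, name, position):
        self.objects[name] = position

    def commit(self):
        """Snapshot the current layout; returns a version number."""
        self.history.append(copy.deepcopy(self.objects))
        return len(self.history) - 1

    def restore(self, version):
        """Put everything back exactly where it was at that version."""
        self.objects = copy.deepcopy(self.history[version])

    def find(self, name):
        """Reveal an object's location, like asking to find your car keys."""
        return self.objects.get(name)

# Last week's meeting: leave the whiteboard and a stack of notes where we used them.
room = Room()
room.place("whiteboard", (0.0, 1.5, 2.0))
room.place("notes_stack", (1.2, 0.8, 0.5))
last_week = room.commit()

# The room gets reset for someone else's meeting...
room.objects.clear()

# ...but when we meet again, we restore last week's context and find things spatially.
room.restore(last_week)
print(room.find("notes_stack"))  # (1.2, 0.8, 0.5)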

This way we can transcend what is possible spatially and not create a world that is merely a lower-fidelity imitation of the real world. I get that the primary benefit early adopters get from Workrooms is allowing people who are far apart to collaborate in a more meaningful way, much like those early films allowed people not in the room to experience the play. That's already super-valuable, especially in today's world. But perhaps spatial context could be akin to editing in films and create an entirely new medium in which we can collaborate?