Feed


  1. Using the Page Visibility API

    (developer.mozilla.org)

    This post takes a look at what page visibility is, shows how you can use the Page Visibility API in your applications, and describes pitfalls to avoid if you build features around this functionality.

  2. Browsertech Digest: SF&LA events, Kyle Barron interview, Arrow/Parquet

    (digest.browsertech.com)

    Hey folks, welcome to the digest!

    Events

    I'll be in SF & LA in a few weeks and hosting some Browsertech events. On May 22 in SF we're co-hosting a happy hour with Krea. Then on May 23 we're hosting the first Browsertech LA. If you're in the SF or LA area, I hope to see you there!

    Kyle Barron interview

    On the podcast, I talked with Kyle Barron of Development Seed about his work in open-source geospatial tooling in the browser. Kyle is a prolific developer who has worked on bindings for Apache Arrow, Parquet, GeoArrow, and Deck.gl, culminating in Lonboard.

    We talked about why he needed to look beyond GeoJSON and Leaflet to scale to millions of points, where Arrow and Parquet fit into the picture, and optimizing the path of data from a server-side Python DataFrame to the GPU.

    The ubiquity of Arrow / Parquet

    Something that stood out as I was editing this episode is that the sibling Apache projects Parquet and Arrow keep coming up on the podcast.

    We get into them on the episode with Kyle, but the tl;dr (tl;dl?) is that they are both data formats for generic, typed tabular data. Parquet optimizes for space, so is often used on disk and on the wire. Arrow is an in-memory format that optimizes for fast random access. Their data models are almost equivalent, so a common approach is to read Parquet from disk/network directly into Arrow in memory.
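
    As a rough sketch of that last point (assuming the pyarrow library; the file name is a placeholder):

    import pyarrow.parquet as pq

    # Parquet on disk decodes straight into Arrow's in-memory columnar
    # format, without materializing intermediate row objects.
    table = pq.read_table("points.parquet")  # returns a pyarrow.Table
    print(table.num_rows, table.schema)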

    Episode 2 guests Andrew and Eric also talked about how they use Arrow in Prospective. In episode 3, Ben Schmidt of Nomic talked about his use of Arrow as well.

    A common theme across these episodes is eliminating transformations from data as it travels through the system. A naive approach to browser-based apps is to go from an in-memory representation on the server, to JSON on the wire, to in-memory JavaScript objects parsed from that JSON, to the relevant attributes packed into attribute buffers on the GPU (often with one draw call for each object).

    To be clear, this works for a lot of things! If you just need to annotate a map with a few hundred points, you won't notice the overhead.

    At scale, though, each of those transformations takes linear time, so as your data grows, so does the compute time. The Arrow approach of using a singular data representation (with Parquet as a cheap compression step across the wire) means apps can work with orders of magnitude more data without sacrificing responsiveness.

    Until next time,

    Paul

  3. It's always TCP_NODELAY. Every damn time.

    (brooker.co.za)


    It's not the 1980s anymore, thankfully.

    The first thing I check when debugging latency issues in distributed systems is whether TCP_NODELAY is enabled. And it’s not just me. Every distributed system builder I know has lost hours to latency issues quickly fixed by enabling this simple socket option, suggesting that the default behavior is wrong, and perhaps that the whole concept is outmoded.

    First, let’s be clear about what we’re talking about. There’s no better source than John Nagle’s RFC896 from 1984[1]. First, the problem statement:

    There is a special problem associated with small packets. When TCP is used for the transmission of single-character messages originating at a keyboard, the typical result is that 41 byte packets (one byte of data, 40 bytes of header) are transmitted for each byte of useful data. This 4000% overhead is annoying but tolerable on lightly loaded networks.

    In short, Nagle was interested in better amortizing the cost of TCP headers, to get better throughput out of the network. Up to 40x better throughput! These tiny packets had two main causes: human-interactive applications like shells, where folks were typing a byte at a time, and poorly implemented programs that dribbled messages out to the kernel through many write calls. Nagle’s proposal for fixing this was simple and smart:

    A simple and elegant solution has been discovered.

    The solution is to inhibit the sending of new TCP segments when new outgoing data arrives from the user if any previously transmitted data on the connection remains unacknowledged.

    When many people talk about Nagle’s algorithm, they talk about timers, but RFC896 doesn’t use any kind of timer other than the round-trip time on the network.
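
    In rough pseudocode, the rule looks something like this (a simplified Python sketch; the names are invented for illustration):

    def should_send_now(buffered_bytes: int, mss: int, unacked_data: bool) -> bool:
        # Send immediately if a full segment can be filled, or if nothing
        # is in flight; otherwise hold the data until an ACK arrives.
        return buffered_bytes >= mss or not unacked_data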

    Nagle’s Algorithm and Delayed Acks

    Nagle’s nice, clean proposal interacted poorly with another TCP feature: delayed ACK. The idea behind delayed ACK is to delay sending the acknowledgement of a packet at least until there’s some data to send back (e.g. a telnet session echoing back the user’s typing), or until a timer expires. RFC813 from 1982 is the first that seems to propose delaying ACKs:

    The receiver of data will refrain from sending an acknowledgement under certain circumstances, in which case it must set a timer which will cause the acknowledgement to be sent later. However, the receiver should do this only where it is a reasonable guess that some other event will intervene and prevent the necessity of the timer interrupt.

    which is then formalized further in RFC1122 from 1989. The interaction between these two features causes a problem: Nagle’s algorithm is blocking sending more data until an ACK is received, but delayed ack is delaying that ack until a response is ready. Great for keeping packets full, not so great for latency-sensitive pipelined applications.

    This is a point Nagle has made himself several times. For example in this Hacker News comment:

    That still irks me. The real problem is not tinygram prevention. It’s ACK delays, and that stupid fixed timer. They both went into TCP around the same time, but independently. I did tinygram prevention (the Nagle algorithm) and Berkeley did delayed ACKs, both in the early 1980s. The combination of the two is awful.

    As systems builders, this should be a familiar situation: two reasonable features of the system that interact to create an undesirable behavior. This kind of interaction is one of the things that makes protocol design so hard.

    Is Nagle blameless?

    Unfortunately, it’s not just delayed ACK[2]. Even without delayed ack and that stupid fixed timer, the behavior of Nagle’s algorithm probably isn’t what we want in distributed systems. A single in-datacenter RTT is typically around 500μs, then a couple of milliseconds between datacenters in the same region, and up to hundreds of milliseconds going around the globe. Given the vast amount of work a modern server can do in even a few hundred microseconds, delaying sending data for even one RTT isn’t clearly a win.

    To make a clearer case, let’s turn back to the justification behind Nagle’s algorithm: amortizing the cost of headers and avoiding that 40x overhead on single-byte packets. But does anybody send single byte packets anymore? Most distributed databases and systems don’t. Partially that’s because they simply have more to say, partially it’s because of the additional overhead of protocols like TLS, and partially it’s because of encoding and serialization overhead. But mostly, they have more to say.

    The core concern of not sending tiny messages is still a very real one, but we’ve very effectively pushed that into the application layer. Sending a byte at a time wrapped in JSON isn’t going to be very efficient, no matter what Nagle’s algorithm does.

    Is Nagle needed?

    First, the uncontroversial take: if you’re building a latency-sensitive distributed system running on modern datacenter-class hardware, enable TCP_NODELAY (disable Nagle’s algorithm) without worries. You don’t need to feel bad. It’s not a sin. It’s OK. Just go ahead.
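
    For reference, a minimal sketch of doing that in Python (the host and port are placeholders):

    import socket

    # Connect, then disable Nagle's algorithm on this socket.
    sock = socket.create_connection(("service.internal", 8080))
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)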

    More controversially, I suspect that Nagle’s algorithm just isn’t needed on modern systems, given the traffic and application mix, and the capabilities of the hardware we have today. In other words, TCP_NODELAY should be the default. That’s going to make some “write every byte” code slower than it would otherwise be, but those applications should be fixed anyway if we care about efficiency.

    Footnotes

    1. I won’t go into it here, but RFC896 is also one of the earliest statements I can find of metastable behavior in computer networks. In it, Nagle says: “This condition is stable. Once the saturation point has been reached, if the algorithm for selecting packets to be dropped is fair, the network will continue to operate in a degraded condition.”
    2. As this has gone around the internet, a number of folks have asked about TCP_QUICKACK. I don’t tend to reach for it for a few reasons, including lack of portability, and weird semantics (seriously, read the man page). The bigger problem is that TCP_QUICKACK doesn’t fix the fundamental problem of the kernel hanging on to data longer than my program wants it to. When I say write(), I mean write().

  4. Podcast appearance: The Debrief from Incident.io

    (blog.danslimmon.com)

    I’m so grateful to Incident.io for the opportunity to shout from their rooftop about Clinical troubleshooting, which I firmly believe is the way we should all be diagnosing system failures. Enjoy the full episode!

  5. Looking at People

    (chriscoyier.net)

    At one time I got interested in the “eye contact”, or lack of it, on video calls. I was going to present at an online conference, and I wanted people watching to have it look like I was looking right at them. Plus, possibly even more importantly, one-on-one calls with people. I ended up buying […]

  6. Things I found interesting in Bonnie Nardi’s “A Small Matter of Programming”

    (petemillspaugh.com)

    I read A Small Matter of Programming by Bonnie Nardi with some others in the Future of Coding book club.

    It’s amazing how current this book reads given it was published over 30 years ago in 1993. That was only two years after the world’s first website and two years before JavaScript was invented.

    On what end users care about (xii, 5)

    [End users] have little or no training in computer science, and often little interest in computers per se. Most end users regard computers as useful—even indispensable—machines capable of helping them work more productively, creatively, and with greater pleasure. This is not the same as finding computers themselves intrinsically interesting—as a computer scientist or engineer would—and it dictates an entirely different set of tools for end users.

    This is a timeless reminder from Nardi that users don’t care about our tech stack. They don’t care that I’m using Next instead of Remix, or Svelte or Astro or whatever. The thing that always sticks in my mind is Jerome Hardaway saying that users only care if your website (1) works, (2) is fast, and (3) looks good. In that order, too.

    A user of a software system was heard to remark, "I didn’t have much time, so I did it the long way." ... The converse of the statement, putatively from a collection of MIT graffiti, goes: “I would rather write programs to help me write programs than write programs.”

    End users like computers to get work done. Programmers like computers because programming is fun and interesting.

    I feel this when showing websites to friends and family. The little touches and polish that satisfy me often go unnoticed by others. The way I navigate websites can be quite different from the way someone who doesn’t create websites for a living navigates websites.

    That’s not to say end users aren’t sophisticated, though.

    End users are not "casual", "novice" or "naive" users; they are people such as chemists, librarians, teachers, architects, and accountants, who have computational needs and want to make serious use of computers, but who are not interested in becoming professional programmers.

    Nardi nicely tees up the main point of the book with this sentence. My paraphrased tl;dr of that main point is: end-user software should be easy enough for a user to do something useful within a couple hours of learning, but advanced enough that a user can become an expert over time, without limitation.

    That quotation about end users’ Serious Use of Computers is the most inspiring tidbit of the whole book for me. If you the programmer build powerful tools for your end users, they will build really impressive things. We as programmers have a high leverage opportunity to arm lots of smart, motivated people with the tools that will allow them to build sophisticated programs. Observable’s collection of examples is a powerful illustration of this concept.

    On design (3, 6, 132)

    I like the idea of designing and writing software for one user, to start. But if your goal is for others to use your software, inevitably you’ll run into the trap that your wishes won’t perfectly match those of your users. Nardi’s vision of end-user programming accounts for emergent and surprising use cases of your software by offering flexibility to users.

    On anticipating those user wishes, Nardi writes:

    As has been shown time and again, no matter how much designers and programmers try to anticipate and provide for what users will need, the effort always falls short because it is impossible to know in advance what may be needed.

    And a few pages later, my favorite quotation from the entire book, hands down:

    Of course design still is, and almost certainly always will be, a black art whose most crucial elements remain an incalculable mix of imagination, intuition, and intellectual interaction with one’s fellows.

    This really resonates. You have to hold the thing in your hands, collaborate with teammates, and talk to users to find the right design.

    More on that toward the end of the book:

    The chief point is that application development merges with user interface design: one does not build "the application" and then tack on a user interface; rather, the two processes are more closely interwoven, and the user interface, by being subsumed in the application development process, comes to be a critical aspect of organizing and presenting application semantics—not an afterthought.

    On natural language-based human computer communication (10, 15, 83)

    We begin, in chapter 2, by asking why we need end user programming systems at all. Aren’t natural language-based systems that will allow us to just tell the computer what we want right around the corner? We...argue that conversation is not the best model for human-computer communication.

    If by right around the corner Nardi—or really the crowd she’s speaking on behalf of—meant November 2022, then yes natural language conversation with computers was just around the corner in 1993. This section of the book reads as the AI chapter and Nardi as the AI skeptic.

    I think she nails the idea that language models enable a programmer to iterate more quickly but don’t (yet) take the programmer out of the loop:

    While one can conceive of a system that could automatically generate programs for end users, the probability of such a system producing right-the-first-time programs appears to be low. There may well be a role for high-level programming languages and environments that enable users to specify programs, evaluate them, and iterate toward desirable ones. These will probably not deliver the miracle of "automatic programming" that one might hope for, however.

    I like the label right-the-first-time programs. And even with powerful language models now available for workflows like generating code, brainstorming, and researching, a lot of work is left to be done for computer agents to act on our natural language wishes. Nardi discusses context as the missing piece.

    The computer fails to understand and to speak because much of what is understood in a conversation attaches to the context in which the participants of a conversation find themselves, and of which they have knowledge.

    She provides the example of asking a computer to distribute the latest draft to the group and let [me] know when they’ve read it. That simple-for-a-human command is full of context and would require complex logic for an agent to complete.

    On alternatives to natural language (24)

    Nardi also makes a compelling case that, depending on the activity, we already use many forms of communication better suited to their given task than natural language. Driving a car is a great (and funny) example. A steering wheel is way better suited to the task of driving than talking to your car would be.

    This reminds me of something I read recently in Make It Stick—that there is no empirical support for the claim that instruction in one’s preferred learning style improves learning outcomes. For example, we often hear people (myself included) say "I am a visual learner," but as yet there’s no evidence to back it up. From the book:

    The popular notion that you learn better when you receive instruction in a form consistent with your preferred learning style, for example as an auditory or visual learner, is not supported by the empirical research...It is more important that the mode of instruction match the nature of the subject being taught: visual instruction for geometry and geography, verbal instruction for poetry, and so on.

    Nardi hits on that point:

    For many problems, a graphical representation is much the most "natural"...Again, the problem is one of matching the practical problem at hand to the design of the technology, rather than assuming a priori the primacy of one form of communication, i.e. conversation.

    It’s the subject matter that dictates the right medium of communication.

    On baseball (28, 30-33, 85)

    Nardi argues that formal languages and notations are widely used by people in all walks of life. My favorite example she goes into is baseball, which is full of specialized communication: catchers signaling to pitchers, base coaches signaling to runners and batters, fans and statisticians keeping scorecards, etc. She quotes from Fred Schwed, Jr. in Baseball by the Rules:

    The baseball box score is the pithiest form of written communication in America today. It is two or three hours...of complex activity, virtually inscribed on the head of a pin.

    Baseball score keeping is a particularly fascinating example because it’s common for fans to fill out scorecards while attending a game, purely for fun. My grandfather actually used to do it when I went to Baltimore Orioles games with him. Fans are motivated to learn a fairly complex set of rules and notation to fill out scorecards, without any requirement or monetary incentive to do so.

    On knitting (34)

    Nardi follows her explanation of baseball notation with a similar exploration of knitting notation. This back-to-back sequence produced an interesting effect for me as the reader.

    As a casual baseball fan who hasn’t ever kept scorecards, it was fun and engaging to read and think about baseball score keeping and notation. But as a knitting novice, reading knitting shorthand produced something similar to the eyes-glaze-over effect you’d see when showing code to a non-coder or math formulas to a non-mathematician. Knitting instructions are intimidating to me because of my lack of familiarity with the domain whereas baseball scorecards are interesting like a puzzle or game.

    On end user programming as good fun (35)

    As with baseball scorecards or knitting, end user programs should be fun and satisfying.

    In addition to being useful, formal systems have an appeal all their own; people naturally gravitate to them. Games are perhaps the clearest example of the way we spontaneously engage in interaction with formal systems.

    Board games are another fantastic example. Playing Catan requires learning a non-trivial set of rules, for example.

    But it’s not just leisurely activities that can be fun—or maybe satisfying is a better word for it. When I worked as an investment banker, I found great satisfaction in working on spreadsheets. Nardi makes the point that only when people have a particular interest or motivation in something—be it baseball, knitting, or forecasting financials—will they learn a formal language to accomplish their goal.

    And familiarity with a domain doesn’t just lead to fun or satisfaction in learning and using a formal notation. It also unlocks a better ability and confidence to learn that formal skill. Nardi cites studies that show participants' math and logic skills are much better when problems are framed in things they are familiar with—like cars—than when framed abstractly as pure math.

    On UI density (65, 85-87)

    One major challenge with visual programming and visual end user programs is information density in the UI. Text often wins out because its density can be hard to match, but there are counterexamples, like spreadsheets organizing computation in a grid. The book argues that a hybrid approach with text and graphics mixed together—like in spreadsheets—makes the most sense, and that it’s unrealistic to disregard text.

    Sometimes a few words are worth a thousand pictures.

    Nardi notes that spreadsheet users have a strong preference for seeing as much data as possible without scrolling. I like to view large code files and long documents on a vertical monitor for the same reason.

    On spreadsheets

    The more Nardi discusses spreadsheets, the more in awe I am of their elegance, which I did not fully appreciate when I used to spend ~12-16 hours per day working in Excel. I am always impressed by UIs that balance density and polish. But what I think is most elegant about spreadsheets is that they are simple enough to be useful to a beginner within the first hour of use and powerful enough for experts to perform sophisticated tasks over hundreds of hours.

    Nardi describes this concept as layers:

    ...the system must incorporate the advanced features that make sophisticated customizations and extensions possible in the first place. But...the system should also be modular...thought of as being in layers, such that an end user may access a first layer that is easy to use and learn, while succeeding layers house modules for more advanced use.

    Spreadsheets are not without shortcomings, of course. For example, I am floored in retrospect by how bad our version control system was in Excel at my old job. We basically just versioned up the file whenever it felt like a good breaking point, so you’d end up with files like shopify_v346.xlsx. Even worse, two people couldn’t reasonably work on the same model simultaneously without git-style branching available. Version control with git was one of my happiest discoveries when I started coding.

    On associating information with its physical location (87)

    Nardi includes conversations from ethnographic studies of spreadsheet users who talk about knowing where to find stuff in their models. One user knows that if she’s in the middle of her spreadsheet around the Municipal Bonds section then the Preferred Stocks section must be above that.

    Remembering where some information is in a spreadsheet has a similar feeling for me to remembering where I read something in a physical book. I can remember that some piece of information was in a smaller paragraph around the bottom of a page on the left of the book, for example, then scan through the book until I find it. I don’t often experience that kind of feel with a codebase.

    On modern end-user programming applications and research

    Nardi includes something of a call for research in the beginning of the book:

    One hope for this book is that it will stimulate further empirical research on existing end user software systems so that we can learn more about what works and what doesn’t work.

    I am new to this field, so I’d appreciate any recommendations on papers to read or examples to check out.

    For the research side, I’d like to spend some time using Elicit as a tool to survey the end-user programming research in the last 30 years. So far, other than this book I’ve read the Ink & Switch essay on end-user programming and Steve Krouse’s essay on end-programmer programming, and I’ve listened to the Future of Coding podcast episode about this book (discussion of the book starts around the 20m mark).

    As for modern examples of end-user programming in the wild, Observable and Notion are two programs I’ve actually used that come to mind. Then there are apps I haven’t used like Zapier, iOS custom shortcuts, and website builders (e.g. Bubble, Webflow).

    Steve—in collaboration with Ellen Chisa and Paul Biggar when they were working on Dark—created The Whole Code Catalog. That is a great list, but it is now 5 years old, and it includes some more programmer-oriented tools like Glitch and Zeit (Vercel) that I don’t think of as end-user tools (although I know the lines are blurry). I’m curious to learn about the best modern examples of end-user programs, so feel free to reply below or on Twitter/wherever if you have examples to share.

  7. Programming mantras are proverbs

    (lukeplant.me.uk)

    Proverbs are supposed to encapsulate a bit of wisdom, but you still need to know when to apply it.

  8. Paradigms succeed when you can strip them for parts

    (buttondown.email)

    I'm speaking at DDD Europe about Empirical Software Engineering![1] I have complicated thoughts about ESE and foolishly decided to update my talk to cover studies on DDD, so I'm going to be spending a lot of time doing research. Newsletters for the next few weeks may be light.

    The other day I was catching myself up on the recent ABC conjecture drama (if you know you know) and got reminded of this old comment:

    The methods in the papers in question can be used relatively quickly to obtain new non-trivial results of interest (or even a new proof of an existing non-trivial result) in an existing field. In the case of Perelman’s work, already by the fifth page of the first paper Perelman had a novel interpretation of Ricci flow as a gradient flow which looked very promising, and by the seventh page he had used this interpretation to establish a “no breathers” theorem for the Ricci flow that, while being far short of what was needed to finish off the Poincare conjecture, was already a new and interesting result, and I think was one of the reasons why experts in the field were immediately convinced that there was lots of good stuff in these papers. — Terence Tao

    "Perelman's work" was proving the Poincaré conjecture. His sketch was 68 pages of extremely dense math. Tao's heuristic told him it was worth working through is that it had interesting ideas that were useful to other mathematicians, even if the final paper was potentially wrong.

    I use this idea a lot in thinking about software: the most interesting paradigms are ones you can "scavenge" ideas from. Even if the whole paradigm isn't useful to you, you can pull out some isolated ideas, throw away the rest, and still benefit from it. These paradigms are the ones that are most likely to spread, as opposed to paradigms that require total buyin.

    Let's take as an example functional programming (FP). ML/Haskell-style FP has a lot of interconnected ideas: higher-order functions, pattern matching, algebraic data types, immutable data structures, etc. But if you strip all of that away and stick to writing everything in idiomatic Java/Python/C, and the only change you make is writing more pure functions, then you've already won.
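
    Here's a minimal sketch of that one change in Python (the example is invented for illustration):

    # Impure: reads and mutates state outside the function, so its behavior
    # depends on what happened before.
    cart = []

    def add_item(item):
        cart.append(item)

    # Pure: same inputs always produce the same output, nothing is mutated.
    def added(cart, item):
        return cart + [item]

    No FP machinery required, and the pure version is easier to test and reuse in any of those languages.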

    This has several positive effects on FP adoption. It immediately demonstrates that FP has good ideas and is not just some weird cult practice. It also means people can "dip their toes in" and benefit without having to go all in. Then they can judge if they're satisfied or want to go further into learning FP. And if they do go further, they can learn new ideas gradually in stages.

    By contrast, compare array programming languages (APLs). There are lots of cool ideas in APL, but they're so interconnected it's hard to scavenge anything out. If I want to use multidimensional arrays in Python I can't just whip them up myself; I have to import an external library like numpy and learn a whole bunch of new ideas at once. So most of the ideas in APL stay in APLs.[3]

    A related idea is that techniques are only adopted if you can "half-ass" them. TDD is useful even if you only use it sometimes and phone in the "green-refactor" steps, which helped adoption. Cleanroom depends on everybody doing it properly all the time, which hindered adoption.

    Scavenging from formal methods

    Part of the reason I'm thinking about this is that business has been real slow lately and I'm thinking about how I can make formal methods (FM) more popular. What ideas can we scavenge from writing specifications?

    Obvious ones are predicate logic and the implication operator, but they're both poorly supported by existing programming languages, and I'm not sure how to best pitch them.[2] Also, they're "too big", and it takes a lot of practice to wrap your mind around using quantifiers and implication.

    Sets, definitely. Sets are unordered, unique collections of objects. Lots of things we store as lists could be sets instead. I use sets way more after learning some FM.

    There's also a lot of high-level ideas in FM that usefully apply to all other code. System invariants are things that should always hold in every possible state. More broadly, we can categorize properties: "safety" is roughly "bad things don't happen", "reachability" is "good things always could happen", and "liveness" is "good things always do happen". Fairness, maybe. Refinement is definitely too in-the-weeds to be scavengable.

    I'd also call out decision tables as a particular technique that's extractable. You can teach decision tables in five minutes to a completely nontechnical person.
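
    A minimal sketch of the idea in Python (the approval rules are invented for illustration):

    # A decision table maps every combination of conditions to one outcome,
    # which makes it obvious at a glance that no case has been missed.
    # Conditions: (is_manager, within_budget)
    APPROVAL = {
        (True, True): "approve",
        (True, False): "escalate",
        (False, True): "needs manager sign-off",
        (False, False): "reject",
    }

    def decide(is_manager: bool, within_budget: bool) -> str:
        return APPROVAL[(is_manager, within_budget)]

    assert decide(True, False) == "escalate"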

    As I said, short newsletters for the rest of this month. Have a great week!


    Things I'm curious about

    • Logic Programming: What ideas are scavengable from LP? I know that pattern matching originated in LP and found its way to FP from there, but that's associated with FP more now.
    • Nontechnical People: What's a good word for them? "Nondev" seems like it leaves out data scientists, DBAs, etc. "Normie" and "Muggle" are too offensive. "Layfolk" just sounds weird.
    • TLA+ workshops: Would people be interested in an "intermediate" TLA+ workshop, for people who've already written production specs but want to get better? I'm thinking half a day, 6-8 people, with people bringing up what they want to learn about in advance. If there's enough people who'd want this I can plan something out.

    1. Domain-Driven Design. 

    2. This is one big reason my attempt to write a book on predicate logic has been in a long stall. 

    3. One idea that mathematicians scavenged from APLs is the Iverson bracket: [P] = P ? 1 : 0

  9. Testing with Python (part 4): why and what to test?

    (bitecode.dev)

    Delaying the chores by second-guessing all your decisions

  10. The World Record for Loneliness

    (blog.danslimmon.com)

    What's the farthest any person has been from the nearest other person?

  11. Backup Game Boy ROMs and saves on Ubuntu

    (sethmlarson.dev)


    Published 2024-05-06 by Seth Larson

    I'm a big fan of retro video games, specifically the Game Boy Color, Advance, and GameCube collections. The physicality of cartridges, link cables, and accessories before the internet was widely available for gaming has a special place in my heart.

    With the recent changes to the App Store to allow emulators (and judging by the influx of issues opened on the Delta emulator GitHub repo) there is a growing interest in playing these games on mobile devices.

    So if you're using Ubuntu like me, how can you back up your ROMs and saves?


    Using GB Operator with my copy of Pokémon FireRed

    What you'll need

    To back up data from Game Boy cartridges I used the following software and hardware:

    • Game Boy, GBC, or GBA cartridge
    • GB Operator from Epilogue ($50 USD + shipping)
    • Playback software from Epilogue
    • Epilogue includes a USB-C to USB-A connector, so an adapter may be needed

    There are other options for backing up Game Boy ROMs and saves, some of which are less expensive than the GB Operator, but I went with the GB Operator because it explicitly listed Linux support and from watching reviews appeared to provide a streamlined experience.

    Getting started with GB Operator and Playback

    Download the Playback AppImage for Linux and the CPU architecture you have. Make the AppImage file executable with:

    $ chmod a+x ./Playback.AppImage
    

    If you try to execute this file ($ ./Playback.AppImage) and receive this error:

    dlopen(): error loading libfuse.so.2
    
    AppImages require FUSE to run. 
    You might still be able to extract the contents of this AppImage 
    if you run it with the --appimage-extract option. 
    See https://github.com/AppImage/AppImageKit/wiki/FUSE 
    for more information
    

    You'll need to install FUSE on Ubuntu:

    $ sudo apt-get install libfuse2
    

    After this you should be able to run Playback:

    $ ./Playback.AppImage
    

    From here the application should launch, but even with your GB Operator plugged in there's a chance it won't be detected, most likely because your current user doesn't have access to the USB device. Epilogue provides some guides on how to enable access.

    After following the above guide and logging in and out for the changes to take effect your GB Operator should be detected. Connecting a cartridge and navigating to "Data" in the menus provides you with options to "Backup Game" and "Backup Save".

    Selecting these options might trigger a crash with the following error when starting the export process:

    (Playback.AppImage:44475): Gtk-WARNING **: 15:05:20.886: Could not load a pixbuf from icon theme.
    This may indicate that pixbuf loaders or the mime database could not be found.
    **
    Gtk:ERROR:../../../../gtk/gtkiconhelper.c:494:ensure_surface_for_gicon: assertion failed (error == NULL): Failed to load /usr/share/icons/Yaru/16x16/status/image-missing.png: Unrecognized image file format (gdk-pixbuf-error-quark, 3)
    Bail out! Gtk:ERROR:../../../../gtk/gtkiconhelper.c:494:ensure_surface_for_gicon: assertion failed (error == NULL): Failed to load /usr/share/icons/Yaru/16x16/status/image-missing.png: Unrecognized image file format (gdk-pixbuf-error-quark, 3)
    Aborted (core dumped)
    

    The fix that worked for me came from a Redditor who talked with Epilogue support and received the following answer:

    LD_PRELOAD=/usr/lib/x86_64-linux-gnu/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-svg.so \
      ./Playback.AppImage
    

    Running the AppImage with the LD_PRELOAD value set fixed my issue. I've since added this shim to an alias so I don't have to remember it. Hopefully in a future version of Playback this won't be an issue.

    From here backing up your ROMs and saves should work as expected. Happy gaming!


  12. sphinxcontrib-sqltable 2.1.0 - SQLAlchemy 2.0 support

    (doughellmann.com)

    What’s new in 2.1.0?

    • Update packaging to use pyproject.toml
    • Update sqltable.py to support SQLAlchemy 2.0 (contributions by Gabriel Gaona)
    • Pin SQLAlchemy>=2.0 to prevent backwards incompatibility with changes to the query execution interface (contributions by Gabriel Gaona)
    • Fix up the Python version support list
    • Fix the obsolete link to Bitbucket (contributions by kbaikov)
    • Find the sample schema file depending on how Sphinx is invoked
    • Move the sample database to /tmp for rtd.org

  13. (chriscoyier.net)

    If you long for a web of yore where things were better, somehow: The thing is: none of this is gone. Nothing about the web has changed that prevents us from going back. If anything, it’s become a lot easier. We can return. Better yet: we can restore the things we loved about the old web while incorporating […]

  14. Weekly Update 398

    (troyhunt.com)

    Presently sponsored by: Kolide believes that maintaining endpoint security shouldn’t mean compromising employee privacy. Check out our manifesto: Honest Security.

    How many different angles can you have on one data breach? Facial recognition (which probably isn't actual biometrics), gambling, offshore developers, unpaid bills, extortion, sloppy password practices and now, an arrest. On pondering it more after today's livestream, it's the unfathomable stupidity of publishing

  15. A year of publishing the MDN Blog

    (developer.mozilla.org)

    We've been writing about web development and the web platform on the MDN Blog since May 2023. Here are our highlights and top posts, along with our favorites.

  16. What's up Python? Lots of Wasm, Pydantic on fire, and Guido in a crossword...

    (bitecode.dev)

    April, 2024

  17. Isolating risk in the CPython release process

    (sethmlarson.dev)


    Published 2024-05-02 by Seth Larson

    This critical role would not be possible without funding from the Alpha-Omega project. Massive thank-you to Alpha-Omega for investing in the security of the Python ecosystem!

    The first stage of the CPython release process produces source and docs artifacts. In terms of "supply chain integrity", the source artifacts are the most important ones produced by this process. These tarballs are what propagates down into containers, pyenv, and operating system distributions, so reducing the risk that these artifacts are modified in-flight is critical.

    A few weeks ago I published that CPython's release process for source and docs artifacts was moved from developers' machines onto GitHub Actions, which provides an isolated build environment.

    This already reduces the risk of artifacts being accidentally or maliciously modified during the release process. Previously, the build and release process used a build script which built the software from source, built the docs, and then ran tests, all in the same job. This was totally fine on a developer's machine, where isolation between stages isn't possible anyway.

    [Diagram: the release pipeline (source code, build and docs dependencies, "Build Source", "Build Docs", testing, and the source and docs artifacts), shown first as one combined job and then split into isolated stages.]

    Before and after splitting up build stages

    With GitHub Actions we can isolate each stage from the others and remove the need to install all dependencies for all jobs into the same stage. This drastically reduces the number of dependencies, each representing a small amount of risk, for the stages that are critical for supply chain security of CPython (specifically the building of source artifacts).

    Above you can see the previous process on the left, which pulls all dependencies into the same job (represented as a gray box), and the new process on the right, which splits up the source build, the docs build, and testing.

    After doing this split, the "Build Source" task only needs ~170 dependencies instead of over 800 (most of which were for documentation, LaTeX, and PDFs), and all of those dependencies either come with the operating system, and thus can't be reduced further, or are pinned in a lock file.

    The testing stage still has access to the source artifacts, but only after they've been uploaded to GitHub Actions Artifacts, so it isn't able to modify them.

    I plan to write a separate in-depth article about dependencies, pinning, and related topics; stay tuned for that.

    SOSS Community Day 2024 recordings

    The recordings for my talk and the OpenSSF tabletop session have been published to YouTube:

    Embrace the Differences: Securing open source software ecosystems where they are

    In the talk I discuss the technical and also social aspects of why it's sometimes difficult to adopt security changes into an open source ecosystem. Ecosystem-agnostic work (think memory safety, provenance, reproducible builds, vulnerabilities) tends to operate at a much higher level than the individual ecosystems where the work ends up being applied.

    OpenSSF Tabletop Session

    The tabletop session had many contributors representing all the different aspects of discovering, debugging, disclosing, fixing, and patching a zero-day vulnerability in an open source component that's affecting production systems.

    Tabletop Session moderated by Dana Wang

    Mentoring for Google Summer of Code

    Google Summer of Code 2024 recently published its program and among the projects and contributors accepted was CPython's project for adopting the Hardened Compiler Options Guide for C/C++. I'm mentoring the contributor through the process of contributing to CPython and hopefully being successful in adopting hardened compiler options.

    Other items

    That's all for this week! 👋 If you're interested in more you can read last week's report.


  18. "Integration tests" are just vibes

    (buttondown.email)

    New blog post! Software Friction is about how all the small issues we run into developing software cause us to miss deadlines, and how we can address some of them. Patreon here.

    "Integration tests" are just vibes

    You should write more unit tests than integration tests. You should write more integration tests than unit tests. "Integrated tests" are a scam.

    But what exactly is an integration test? Is it tests that cross module boundaries? Is it tests that depend on several pieces of behavior, even in one module? Is it anything that tests "emergence"[1]? Anything with side effects? If get_active_users() runs a database query, are all tests against it integration tests?

    No wonder both Twitter and Google have (at one point) claimed that "integration test" wasn't a useful term. There are just too many different things people mean by "integration".[2]

    Even if we can't define "integration test", we still have to recognize them. There are tests that are bigger than our "unit tests" and smaller than an end-to-end test, and people have strong opinions about writing more or less of them. We can have a sense of something being an "integration test" even if we can't strictly define it.

    I think it's more useful to think of integration tests in terms of family resemblances. There are certain measurable properties of tests, and integration tests measure differently from unit tests on them. One such property is test speed: unit tests are "fast", integration tests are "slow". This does not mean all integration tests are slower than any unit test (unit property tests can be very slow). But slow tests are more likely to be integration tests than unit tests.

    "Integration tests" are a vibe we get based on how we perceive a test's properties. Some of these properties are:

    1. The number of lines of production code executed as part of the test.

    2. Number of boundaries crossed, module/object or otherwise.

    3. Number of unit tests you can decompose the test into.

    4. Number of external services required to run the test.

    5. Amount of engineering effort to write/run the tests. Does it need separate files, a separate language, setting up lots of docker instances?

    6. Nonlocality of discovered bugs: if the test fails, how many lines you have to look at to find the problem.

    7. Likelihood a test failure is due to "emergence": things are correct in isolation but interact in a problematic way.

    The more of these have high values, the more likely a test is to be an integration test.

    Miscellaneous Thoughts

    A test that hits the database (external service) can still be a unit test if it runs quickly and covers a small amount of code. Tests that make HTTP calls are less likely to be fast, so more likely to be integration tests.

    Mocks and pact-style testing push some of these values down, by reducing boundaries crossed and lines of code executed at the cost of increased test infrastructure. Contracts and assertions increase bug locality, since the test fails the moment a pre/postcondition is broken instead of when the test completes.

    If you push the values high enough, do you get end-to-end tests instead of integration tests? Or are E2E tests fundamentally different from integration tests in some way?

    Is it possible for a unit test to uncover emergence, or is emergence a sufficient condition to say that something is definitely an integration test?


    Things I'm curious about

    Gonna make this a weekly thing maybe (or whenever I have enough new stuff to ask about).

    • Sokoban. Sokoban is "easier" than Chess or Go (PSPACE vs EXPTIME), but Sokoban AIs regularly fail at simple puzzles that humans can do (ref ex1 ex2). What makes Sokoban "so much harder" for AIs? I don't think it's just a lack of interest, because I've found Google and DeepMind papers on solving Sokoban (and not making much progress).
    • NFCs: I impulse-bought a bunch of NFC chips. I'm already using them to open train trackers, what else can I do with them? I'm not a mobile dev but can whip up Taskers and server code pretty quickly.

    If you know something about either of these, feel free to share! Or write about them publicly and share the link with me. You can just reply directly to this email.


    1. Geeze I can't believe I tried to make "automanual" a thing 

    2. "Unit test" is also fuzzy but we at least agree that add(2, 3) == 5 is a unit test. 

  19. Strum Machine

    (chriscoyier.net)

    I mostly play old time and bluegrass music. Particularly in old time and fiddle songs, the songs have a fairly rigid and repetitious structure. That’s not to say players don’t do interesting things with it, but unless a song is intentionally crooked, the structure of the song remains the same throughout. A typical structure is […]

  20. Humble Chronicles: Shape of the Component

    (tonsky.me)

    Looking for an ergonomic way to define components

  21. Humble Chronicles: The Inescapable Objects

    (tonsky.me)

    Why we need OOP, even in Clojure

  22. Forbidden Links

    (chriscoyier.net)

    Malcolm Coles: 10+ years ago I created an annual list of websites that FORBADE you from linking to them, DEMANDED you write to ask for permission or LIMITED links to only their home page. Royal Mail even promised to post me a paper licence. Now a decade has passed, let’s see who’s still doing it … And […]

  23. Weekly Update 397

    (troyhunt.com)

    Presently sponsored by: Kolide believes that maintaining endpoint security shouldn’t mean compromising employee privacy. Check out our manifesto: Honest Security.

    Banks. They screw us on interest rates, they screw us on fees and they screw us on passwords. Remember the old "bank grade security" adage? I took this saying to task almost a decade ago now but it seems that at least as far as password advice goes,

  24. TypeIs does what I thought TypeGuard would do in Python

    (rednafi.com)

    The handful of times I’ve reached for typing.TypeGuard in Python, I’ve always been confused by its behavior and ended up ditching it with a # type: ignore comment. For the uninitiated, TypeGuard allows you to apply custom type narrowing[1]. For example, let’s say you have a function named pretty_print that accepts a few different types and prints them differently onto the console:

    from typing import assert_never

    def pretty_print(val: int | float | str) -> None:
        if isinstance(val, int):
            # assert_type(val, int)
            print(f"Integer: {val}")
        elif isinstance(val, float):
            # assert_type(val, float)
            print(f"Float: {val}")
        elif isinstance(val, str):
            # assert_type(val, str)
            print(f"String: {val}")
        else:
            assert_never(val)

    If you run it through mypy, in each branch the type checker automatically narrows the type and knows exactly what the type of val is.
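
    For context, a custom narrowing function looks something like this (the canonical example from PEP 647, not from the post itself):

    from typing import TypeGuard

    def is_str_list(val: list[object]) -> TypeGuard[list[str]]:
        # Returning True tells the checker val can be narrowed to list[str].
        return all(isinstance(x, str) for x in val)

    def greet_all(names: list[object]) -> None:
        if is_str_list(names):
            # The type checker now treats names as list[str].
            print(", ".join(names))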

  25. Feedbin Email Newsletter… Emails

    (chriscoyier.net)

    Feedbin has custom email addresses now so you can use a unique email address for each newsletter and not worry about a spam leak.

  26. Testing with Python (part 3): pytest setup

    (bitecode.dev)

    Testing your patience

  27. Podcast: Computers aren't really designed for people @ Думаем дальше

    (tonsky.me)

    Ilya talks about how we squandered multitasking, and Nikita rails against Git's command-line interface.

  28. MemoryDB: Speed, Durability, and Composition.

    (brooker.co.za)


    Blocks are fun.

    Earlier this week, my colleagues Yacine Taleb, Kevin McGehee, Nan Yan, Shawn Wang, Stefan Mueller, and Allen Samuels published Amazon MemoryDB: A fast and durable memory-first cloud database[1]. I’m excited about this paper, both because it’s a very cool system, and because it gives us an opportunity to talk about the power of composition in distributed systems, and about the power of distributed systems in general.

    But first, what is MemoryDB?

    Amazon MemoryDB for Redis is a durable database with microsecond reads, low single-digit millisecond writes, scalability, and enterprise security. MemoryDB delivers 99.99% availability and near instantaneous recovery without any data loss.

    or, from the paper:

    We describe how, using this architecture, we are able to remain fully compatible with Redis, while providing single-digit millisecond write and microsecond-scale read latencies, strong consistency, and high availability.

    This is remarkable: MemoryDB keeps compatibility with an existing in-memory data store, adds multi-AZ (multi-datacenter) durability, adds high availability, and adds strong consistency on failover, while still improving read performance and with fairly little cost to write performance.

    How does that work? As usual, there are a lot of important details, but the basic idea is composing the in-memory store (Redis) with our existing fast, multi-AZ transaction journal[2] service (a system we use in many places inside AWS).

    Composition

    What’s particularly interesting about this architecture is that the journal service doesn’t only provide durability. Instead, it provides multiple different benefits:

    • durability (by synchronously replicating writes onto storage in multiple AZs),
    • fan-out (by being the replication stream replicas can consume),
    • leader election (by having strongly-consistent fencing APIs that make it easy to ensure there’s a single leader per shard),
    • safety during reconfiguration and resharding (using those same fencing APIs), and
    • the ability to move bulk data tasks like snapshotting off the latency-sensitive leader boxes.

    Moving these concerns into the Journal greatly simplifies the job of the leader, and minimizes the amount that the team needed to modify Redis. In turn, this makes keeping up with new Redis (or Valkey) developments much easier. From an organizational perspective, it also allows the team that owns the Journal to really focus on performance, safety, and cost of the journal without having to worry about the complexities of offering a rich API to customers. Each investment in performance means better performance for a number of AWS services, and similarly for cost, investments in formal methods, and so on. As an engineer, and engineering leader, I’m always on the lookout for these leverage opportunities.

    Of course, the idea of breaking systems down into pieces separated by interfaces isn’t new. It’s one of the most venerable ideas in computing. Still, this is a great reminder of how composition can reduce overall system complexity. The journal service is a relatively (conceptually) simple system, presenting a simple API. But, by carefully designing that API with affordances like fencing (more on that later), it can remove the need to have complex things like consensus implementations inside its clients (see Section 2.2 of the paper for a great discussion of some of this complexity).

    As Andy Jassy says:

    Primitives, done well, rapidly accelerate builders’ ability to innovate.

    Distribution

    It’s well known that distributed systems can improve durability (by making multiple copies of data on multiple machines), availability (by allowing another machine to take over if one fails), integrity (by allowing machines with potentially corrupted data to drop out), and scalability (by allowing multiple machines to do work). However, it’s often incorrectly assumed that this value comes at the cost of complexity and performance. This paper is a great reminder that assumption is not true.

    Let’s zoom in on one aspect of performance: consistent latency while taking snapshots. MemoryDB moves snapshotting off the database nodes themselves, and into a separate service dedicated to maintaining snapshots.

    This snapshotting service doesn’t really care about latency (at least not the sub-millisecond read latencies that the database nodes worry about). It’s a throughput-optimized operation, where we want to stream tons of data in the most throughput-efficient way possible. By moving it into a different service, we get to avoid having throughput-optimized and latency-optimized processes running at the same time (with all the cache and scheduling issues that come with that). The system also gets to avoid some implementation complexities of snapshotting in-place. From the paper, talking about the on-box BGSave snapshotting mechanism:

    However, there is a spike on P100 latency reaching up to 67 milliseconds for request response times. This is due to the fork system call which clones the entire memory page table. Based on our internal measurement, this process takes about 12ms per GB of memory.

    and things get worse if there’s not enough memory for the copy-on-write (CoW) copy of the data:

    Once the instance exhausts all the DRAM capacity and starts to use swap to page out memory pages, the latency increases and the throughput drops significantly. […] The tail latency increases over a second and throughput drops close to 0…

    the conclusion being that to avoid this effect database nodes need to keep extra RAM around (up to double) just to support snapshotting. An expensive proposition in an in-memory database! Moving snapshotting off-box avoids this cost: memory can be shared between snapshotting tasks, which significantly improves utilization of that memory.

    The upshot is that, in MemoryDB with off-box snapshotting, performance impact is entirely avoided. Distributed systems can optimize components for the kind of work they do, and can use multi-tenancy to reduce costs.

    Conclusion

    Go check out the MemoryDB team’s paper. There’s a lot of great content in there, including a smart way to ensure consistency between the leader and the log, a description of the formal methods the team used, and operational concerns around version upgrades. This is what real system building looks like.

    Bonus: Fencing

    Above, I mentioned how fencing in the journal service API is something that makes the service much more powerful, and a better building block for real-world distributed systems. To understand what I mean, let’s consider a journal service (a simple ordered stream service) with the following API:

    write(payload) -> seq
    read() -> (payload, seq) or none
    

    You call write, and when the payload has been durably replicated it returns a totally-ordered sequence number for your write. That’s powerful enough, but in most systems would require an additional leader election to ensure that the writes being sent make some logical sense.

    We can extend the API to avoid this case:

    write(payload, last_seq) -> seq
    read() -> (payload, seq) or none
    

    In this version, writers can ensure they are up-to-date with all reads before doing a write, and make sure they’re not racing with another writer. That’s sufficient to ensure consistency, but isn’t particularly efficient (multiple leaders could always be racing), and doesn’t allow a leader to offer consistent operations that don’t call write (like the in-memory reads the MemoryDB offers). It also makes pipelining difficult (unless the leader can make an assumption about the density of the sequences). An alternative design is to offer a lease service:

    try_take_lease() -> (uuid, deadline)
    renew_lease(uuid) -> deadline
    write(payload) -> seq
    read() -> (payload, seq) or none
    

    A leader who believes they hold the lease (i.e. their current time is comfortably before the deadline) can assume they’re the only leader, and can go back to using the original write API. If they end up taking the lease, they poll read until the stream is empty, and then can take over as the single leader. This approach offers strong consistency, but only if leaders absolutely obey their contract that they don’t call write unless they hold the lease.

    That’s easily said, but harder to do. For example, consider the following code:

    if current_time < deadline:
      <gc or scheduler pause>
      write(payload)
    

    Those kinds of pauses are really hard to avoid. They come from GC, from page faults, from swapping, from memory pressure, from scheduling, from background tasks, and many many other things. And that’s not even to mention the possible causes of error on current_time. We can avoid this issue with a small adaptation to our API:

    try_take_lease() -> (uuid, deadline)
    renew_lease(uuid) -> deadline
    write(payload, lease_holder_uuid) -> seq
    read() -> (payload, seq) or none
    

    If write can enforce that the writer is the current lease holder, we can avoid all of these races while still allowing writers to pipeline things as deeply as they like. This still-simple API provides an extremely powerful building block for building systems like MemoryDB.
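
    To make that concrete, here's a minimal sketch of the enforcement on the journal side. This is my illustration in Python, not the actual journal service; the class, names, and error type are all assumptions:

    class NotLeaseHolder(Exception):
        pass

    class Journal:
        def __init__(self):
            self.entries = []        # (seq, payload) pairs; seq is the index
            self.leader_uuid = None  # the current lease holder's uuid

        def write(self, payload, lease_holder_uuid):
            # Fencing: reject writes from anyone but the current lease holder.
            # A paused ex-leader that wakes up after its lease expired gets
            # fenced out here, no matter what its local clock told it.
            if lease_holder_uuid != self.leader_uuid:
                raise NotLeaseHolder()
            seq = len(self.entries)
            self.entries.append((seq, payload))
            return seq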

    Finally, we may not want to tie the journal service to our lease service, because we may want to use other leader election mechanisms. We can avoid that coupling by offering a relatively simple compare-and-set in the journal API:

    set_leader_uuid(new_uuid, old_uuid) -> old_uuid
    write(payload, leader_uuid) -> seq
    read() -> (payload, seq) or none
    

    Now we have a super powerful composable primitive that can offer both safety to writers, and liveness if the leader election system is reasonably well behaved.
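
    Continuing the hypothetical sketch from above, the compare-and-set is only a few lines on the journal side (returning the actual current leader on failure is my assumption about the API's semantics):

    class FencedJournal(Journal):
        def set_leader_uuid(self, new_uuid, old_uuid):
            # Compare-and-set: succeeds only if the caller's view of the
            # current leader was correct. An external election system installs
            # its winner here, and writes fenced on the old uuid start
            # failing immediately.
            if self.leader_uuid != old_uuid:
                return self.leader_uuid  # stale view; leadership unchanged
            self.leader_uuid = new_uuid
            return old_uuid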

    Footnotes

    1. To appear at SIGMOD’24.
    2. The paper calls it a log service, which is technically correct, but a term I tend to avoid because it’s easily confused with logging in the observability sense.

  29. Open Source Summit North America 2024

    (sethmlarson.dev)

    Published 2024-04-24 by Seth Larson

    This critical role would not be possible without funding from the Alpha-Omega project. Massive thank-you to Alpha-Omega for investing in the security of the Python ecosystem!

    Last week I attended SOSS Community Day and OSS Summit. It was great to catch up with friends and to meet new people for the first time at a cross-ecosystem open source event.

    I gave a talk "Embrace the Differences: Securing software ecosystems where they are", which, funnily enough, was paired with a complementary talk about the ways software repositories can collaborate on security.

    My talk focused on how security standards and tools typically want to operate across software ecosystems, and how differences in standards, tools, maintainers, and user expectations between ecosystems can make that difficult.

    You can download my slides, and the recording will eventually be available on YouTube.

    OpenSSF Tabletop Session

    I also participated in the first OpenSSF Tabletop Session organized and hosted by Dana Wang. I played the role of "open source maintainer" and represented how an exploited zero-day vulnerability would appear from the perspective of an open source project.

    I emphasized the realities of vulnerability disclosure to open source projects like under-resourcing, most maintainers being volunteers, and stress caused during times of crisis.

    Cast of the tabletop session!

    So many people!

    I also met up with many folks doing open source security, maintenance, and funding:

    • Met with many folks from the Alpha-Omega cohort. I'm looking forward to having more cross-functional discussions about new approaches to securing open source.
    • Met with Michael Winser from Alpha-Omega to work on our PyCon US 2024 talk State of Supply Chain Security for Python.
    • Met with my friend William Woodruff from Trail of Bits and discussed the system TLS proposal and build provenance for Homebrew (and what could be learned for Python).
    • Met with Samuel Giddins and Martin Emde from the Ruby ecosystem to discuss shared challenges for introducing security into an ecosystem.
    • Met Lauren Hanford from Tidelift to discuss supporting and funding maintainers.
    • Met Mirko from Sovereign Tech Fund and discussed their program for hiring open source maintainers.
    • Attended the talk by Kara Sowles from GitHub on the state of open source funding and learned about "downturn-resilient" funding.
    • Met many folks who asked me about security initiatives happening in the Python ecosystem.

    Other items

    Note that I've been summoned for jury duty starting next week, so expect fewer updates over the next two weeks depending on how that goes.

    That's all for this week! 👋 If you're interested in more you can read next week's report or last week's report.

    Thanks for reading! ♡ Did you find this article helpful and want more content like it? Get notified of new posts by subscribing to the RSS feed or the email newsletter.


    This work is licensed under CC BY-SA 4.0

  30. Hello, Rust

    (petemillspaugh.com)

    I just wrote my first Rust program. Er, it’s not really even my program—I just read an example from a book, tried to implement it from memory a few minutes later, ran cargo run, then used hints from the Rust compiler to fix my code. First impression…the hints are really good!

    Here’s the function:


    fn gcd(mut n: u64, mut m: u64) -> u64 {
        assert!(n != 0 && m != 0);
        while m != 0 {
            if m < n {
                let t = m;
                m = n;
                n = t;
            }
            m = m % n;
        }
        n
    }

    I’m starting with O’Reilly’s Programming Rust, but I plan to go through The Book in parallel and see what sticks.

    I also copied my first test in Rust, which quickly caught a bug in my greatest common divisor implementation.


    #[test]
    fn test_gcd() {
        assert_eq!(gcd(14, 15), 1);
        assert_eq!(gcd(2 * 3 * 5 * 11 * 13, 3 * 7 * 11 * 19), 3 * 11);
    }

    It's pretty nice that cargo includes tests out of the box without having to install a dependency.

  31. Megan Marie Myers

    (chriscoyier.net)

    In Bend, you can’t miss the illustration work of Megan Marie Myers. It’s absolutely everywhere. There are murals. There are calendars for sale everywhere. Prints are hung up at coffeeshops, sometimes as whole exhibitions. She’s got books, stickers, hats, postcards, notecards, and heck, even neck gaiters. If her art wasn’t as charming and well done, […]

  32. "Testing can show the presence of bugs but not the absence"

    (buttondown.email)

    Program testing can be used to show the presence of bugs, but never to show their absence! — Edsger Dijkstra, Notes on Structured Programming

    Dijkstra was famous for his spicy quips; he'd feel right at home on tech social media. He said things he knew weren't absolutely true but would get people listening to him. In the same essay he discusses how testing could, in specific cases, show the absence of bugs:

    • If a function has a finite set of inputs, like floor(float), you could prove correctness by simply testing all possible inputs.
    • Even if a function has infinite inputs, tests prove the absence of bugs for tested inputs (see the sketch after this list). You could say "date_of_easter works for any date within 10,000 years of today" and have "proven the absence of bugs" for the subset of dates people care about.1
    • You can use testing to complete a proof. For example, you can divide inputs into "equivalence classes", prove that if the function is correct for one value it will also be correct for all values in that class, and then test one value from each class.
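
    As a quick illustration of the second bullet (my sketch, with leap years standing in for Easter dates), exhaustively checking a bounded input range against a trusted oracle "proves the absence of bugs" for that range:

    import calendar

    def is_leap(year: int) -> bool:
        # The hand-rolled implementation we want to check.
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

    def test_is_leap_exhaustively():
        # Compare against the stdlib's answer for every year in the window
        # people actually care about: a proof of absence for that subset.
        for year in range(1, 20_000):
            assert is_leap(year) == calendar.isleap(year), year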

    In the majority case, however, testing cannot show the absence of bugs. Testing f(16) does not give you any guarantees about all x. f(x).

    There are two possible takeaways. The reasonable one is that we shouldn't be blindly confident in testing, should use a variety of approaches as part of defense in depth, and always be careful about what we claim. The unreasonable takeaway is that testing is an inferior verification method, and we should use more "rigorous" approaches instead of testing.

    It's unreasonable because methods that show the existence of bugs are more robust than methods that show the absence of them, and this makes them more useful more of the time.

    Comparing tests and proof

    Let's say you have a pseudocode function like this:

    fun elem_before_x(s: int[], x: int) returns (r: int | NULL) {
        let i: int = index(s, x);
        if (i == NULL) {
            return NULL;
        } else return s[i-1];
    }
    
    Property:
     if r == NULL then x notin s
     else some i in 0..<len(s): s[i] == r && s[i+1] == x
    

    The bug here is that we're out-of-bounds if x is the first element of the array. Testing will find this bug if you remember to test that case. Otherwise, we'd have a bug and a passing test suite. Tests can't verify the absence of bugs.
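
    Here's a rough Python rendering of that, with the boundary test we'd have to remember to write. This is my sketch, not from the original post, and it assumes we want None when x is the first element (the fix below instead wraps around to the last element):

    def elem_before_x(s: list[int], x: int) -> int | None:
        try:
            i = s.index(x)
        except ValueError:
            return None
        return s[i - 1]  # bug: when i == 0 this is s[-1], the last element

    def test_elem_before_x():
        assert elem_before_x([5, 6, 7], 8) is None
        assert elem_before_x([5, 6, 7], 6) == 5
        # The boundary case. In Python the bug is a silent wrong answer (7)
        # rather than an out-of-bounds error, but the test still catches it,
        # provided we remember to write it.
        assert elem_before_x([5, 6, 7], 5) is None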

    Now imagine we do find it and fix it:

    else if (i == 0) {
        return s.last;
    }
    

    Modifying the property is left as an exercise to the reader. Once again, the passing test suite means we haven't verified the absence of the bug. Now, how does formal verification do on the same problem? I wrote a Dafny version of the same problem, here, both with and without the fix. Looking at v1, we see it finds the s[0] error:

    >   var v := index.Extract();
    >   return Some(s[v-1]);
    
    (19,16): Error: index out of range    
       |
    19 |     return Some(s[v-1]);
    

    This error also goes away in v2: the bug is proved to be absent. But v2 adds a new error:

    > ensures if r.None? then x !in s
    >           else if s[|s|-1] == r.value then s[0] == x
    >           else s[IndexOf(s, r.value)+1] == x
    
    (7,20): Error: index out of range
      |
    7 |             else if s[|s|-1] == r.value then s[0] == x
      |                      ^^^^^^^
    

    When would s[|s|-1] be an index out of range? When the list is empty. But we know that's not possible: if we return a non-null value then the list must not be empty, because we found x in it! But the prover doesn't know that; it can't infer the relationship between the returned value and what we know about the list.2 We have to annotate our code with proof hints for it to recognize that if we return a number, the list is nonempty.

    In short: the prover definitely finds the out-of-bounds bug, but also raises a spurious "empty list" bug. The test might not find the former but also won't raise the latter. Tests can have false negatives but not false positives, while provers can have false positives but not false negatives.

    (I'd say "formal verification can prove the absence of a bug but never the presence", but that's not true. Dafny can sometimes give you a concrete counterexample to a property, and type systems detect true positives like "you passed an array into an Int -> Int function" all the time.)

    Knowing there "might" be a bug is not as useful as knowing there is one. Empirically speaking, people mostly ignore compiler warnings, and plenty of retrospectives find that people ignore "might possibly be a bug" warnings even in mission-critical software. Formal verification often becomes an "all or nothing" deal: either you totally prove the absence of bugs in your program or you get no benefits at all.

    And that sucks! It takes a lot of effort to prove something correct, magnitudes more than running some tests. And that's why I consider testing "more robust": even a little bit of testing may find a true bug, while a little bit of verification doesn't get you much.

    Correctness is not a thing

    I'll end this by getting on my soapbox. Here's something that's really important to understand about "proving the absence of bugs":

    You do not prove something's correct. You prove conformance to specification.

    It's possible for code to conform to the specification and still have bugs. Two common cases are:

    1. The spec is wrong or incomplete, like v2.dfy is in the above example. Spoiler in the footnote.3
    2. The spec is based on an assumption that's not true in practice, like "no other program is writing to the same file."

    This is also important because it puts the specification front-and-center. You can write tests without having a specification in mind, but you need a spec to do any kind of formal method. Doing FM trains you to think about specs, which in turn helps you write better tests.


    Things I am interested in

    1. Crossovers from fields besides classical engineering, who worked professionally as something else before switching to software. What ideas from your old job as a journalist/librarian/social worker/acrobat could be useful in software, and vice versa?
    2. Wargaming in software. Have you designed or participated in a devops/security/system design/whatever simulated game? Was it helpful?

    If you have insights into either of these, feel free to share! Or write about them publicly and share the link with me. You can just reply directly to this email.


    1. Though you could argue this isn't sufficient for security. If easter(-1e8) causes a memory overflow then that's a bug even if a normal user won't ever call it. 

    2. We're lucky that the Dafny standard library defines a postcondition for index functions saying the output is less than the length of the list. Otherwise the prover would raise a potential out-of-bounds error on s[IndexOf(s, x)]

    3. s[|s|-1] == r.value then s[0] == x isn't true for the input [a, r, x, r]. The good news is that Dafny refuses to verify the code because it didn't match the spec. I only realized this when I was error-checking this piece. (This doesn't refute the point I made above about false positives: the false positive was that s[|s|-1] was out of bounds!) 

  33. Weekly Update 396

    (troyhunt.com)

    Presently sponsored by: Kolide believes that maintaining endpoint security shouldn’t mean compromising employee privacy. Check out our manifesto: Honest Security.

    "More Data Breaches Than You Can Shake a Stick At". That seems like a reasonable summary and I suggest there are two main reasons for this observation. Firstly, there are simply loads of breaches happening and you know this already because, well, you read my stuff! Secondly, There

  34. lilos v1.0 released

    (cliffle.com)

    After five years of development, something like seven art projects, one commercial product, and many changes to the dark corners of the Rust language, I’ve decided lilos is ready for a 1.0 release!

    Some parts I’m excited about include:

    • As of this release, the lilos APIs are entirely cancellation-safe.

    • This release contains contributions from five other people, bringing the total number of contributors to seven! (Want to be number eight? Come say hi!)

    • Thanks to one of those contributors, the operating system tests are now running in CI on QEMU!

    (For anyone who’s new, lilos is a tiny embedded operating system that uses Rust async to allow complex multitasking on very limited microcontrollers without requiring dynamic memory allocation. Read more about lilos on my project page, where I link to the docs and provide a curated collection of blog posts on the topic.)

    See the release notes if you’re curious about what’s changed. If you’ve got firmware written for an earlier version of lilos (particularly the 0.3.x series) and would like to update (you don’t have to!), those release notes will guide you through the process. There have been some breaking API changes, but I promise they’re all improvements.

  35. Setting up service workers on Vultr

    (developer.mozilla.org)

    This guide introduces you to service workers and their lifecycle. Learn how to deploy a project using service workers with HTTPS on Vultr.

  36. (chriscoyier.net)

    Most internet travels by wire. Straight through the dang ocean. Josh Dzieza in a feature for The Verge: These fragile wires are constantly breaking — a precarious system on which everything from banks to governments to TikTok depends. But thanks to a secretive global network of ships on standby, every broken cable is quickly fixed. […]

  37. photostream 130

    (martinfowler.com)

    Big Sur, CA

  38. (chriscoyier.net)

    You’re doing yourself a grave disservice if your writing opens with something boring or banal. You’re going to lose me, at least. I’ve got a list of stuff to read and watch as long as your arm. Maggie really digs into this, in an effort to get better. Your challenge is finding the compelling problem […]

  39. Garden-path incidents

    (blog.danslimmon.com)

    Barb’s story It’s 12 noon on a Minneapolis Wednesday, which means Barb can be found at Quang. As the waiter sets down Barb’s usual order (#307, the Bun Chay, extra spicy), Barb’s nostrils catch the heavenly aroma of peanuts and scallions and red chiles. A wave of calm moves through her. Barb pulls her chair …


  40. Tonsky

    (lovergne.dev)

    What I like about this blog are the "down the rabbit hole" and "nit-picky" articles, like [Hardest Problem in Computer Science: Centering Things](https://tonsky.me/blog/centering/) and [The Absolute Minimum Every Software Developer Must Know About Unicode in 2023 (Still No Excuses!)](https://tonsky.me/blog/unicode/).
    https://tonsky.me/

  41. Microsoft supports urllib3 with FOSS Fund 2024

    (sethmlarson.dev)

    Published 2024-04-17 by Seth Larson

    Back in January we announced that urllib3 would be fundraising to implement support for HTTP/2 in a backwards compatible way to urllib3 v2.x and to ensure the project's development remains sustainable in the long-term.

    Microsoft has awarded urllib3 $5,000 USD as one of its FOSS Fund recipients for 2024 🥳 Thank you, Microsoft for supporting open source software!

    Since announcing our fundraiser we have raised $7,275 USD in 3 months, with many of the donations coming from individuals and long-time sponsors like Sourcegraph and Sentry.

    Thanks Microsoft! ❤️

    How to get started supporting your own dependencies?

    Contribute directly to projects you depend on. Many of them already have funding mechanisms documented like Open Collective, GitHub Sponsors, or similar.

    Subscribe to an organization like Tidelift to handle discovery of dependencies and fundable projects and to get additional guarantees like security and long-term support.

    Thanks.dev provides a lightweight option to throw money in open sources' direction.

    Thanks for reading! ♡ Did you find this article helpful and want more content like it? Get notified of new posts by subscribing to the RSS feed or the email newsletter.


    This work is licensed under CC BY-SA 4.0

  42. Formal Methods: Just Good Engineering Practice?

    (brooker.co.za)

    Yes. The answer is yes. In your face, Betteridge.

    Earlier this week, I did the keynote at TLA+ conf 2024 (watch the video or check out the slides). My message in the keynote was something I have believed to be true for a long time: formal methods are an important part of good software engineering practice. If you’re a software engineer, especially one working on large-scale systems, distributed systems, or critical low-level systems, and you’re not using formal methods as part of your approach, you’re probably wasting time and money.

    Because, ultimately, engineering is an exercise in optimizing for time and money1.

    “It would be well if engineering were less generally thought of, and even defined, as the art of constructing. In a certain important sense it is rather the art of not constructing; or, to define it rudely but not inaptly, it is the art of doing that well with one dollar, which any bungler can do with two after a fashion.” Arthur Wellington2

    At first, this may seem counter-intuitive. Formal methods aren’t cheap, aren’t particularly easy, and don’t fit well into every software engineering approach. It’s reasonable to start with the belief that a formal approach would increase costs, especially non-recurring engineering costs. My experience is that this isn’t true, for two reasons. The first is rework. Software engineering is somewhat unique among the engineering fields in that design and construction tend to happen at the same time, and a lot of construction can be started without advancing much into design. This isn’t true in electrical engineering (designing a PCB or laying cables can’t really be done until design is complete), or civil engineering (starting the earthworks before you know what you’re building is possible, but a reliable way to waste money), or mechanical engineering, and so on. This is a huge strength of software - its mutability has been one of the reasons it has taken over the world - but can also significantly increase the cost of design iterations by turning design iterations into implementation iterations. The second is the cost of change. Once an API or system has customers, changing it becomes many times more expensive and difficult. This is fundamentally related to Hyrum’s Law:

    With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.

    Isolating the behavior of systems with APIs is an excellent idea. In fact, I would consider it one of the most important ideas in all of software engineering. Hyrum’s Law reminds us of the limitations of the approach: users will end up depending on every conceivable implementation detail of an API. This doesn’t mean change is impossible. I have been involved in many projects in my career that have completely re-implemented the system behind APIs. It merely means that change is expensive, and that abstractions like APIs don’t effectively change that reality.

    By saving on the cost of rework, and by getting interface changes out of the way earlier, formal design work can significantly increase the speed and efficiency of the software building process.

    What kinds of systems?

    This doesn’t seem to apply to all kinds of software. Software which is heavily driven by fast-evolving or hard-to-formalize user requirements, from UIs and web sites to implementations of pricing logic, may require so much ongoing rework that the value of up-front design is diluted (or its costs significantly increased). This is the underlying idea behind agile: by running implementation and requirements-gathering in parallel the wall-clock time-to-market can be reduced. More importantly, where requirements gathering is an ongoing process, it allows implementations to be completed even while requirements evolve. In many cases, this parallel approach to development is optimal, and even necessary to make any progress at all.

    On the other hand, much of the software behind large-scale, distributed, and low-level systems has well understood requirements. Or, at least, a large enough subset of well understood static requirements that upfront formal design reduces rework and bug density during the implementation phase (and after, when the system is in production) considerably.

    Much of the debate about design-up-front vs agile comes from folks on different ends of the software spectrum talking past each other. We shouldn’t be surprised that software with very different models for requirements-gathering would have different optimal engineering approaches. It seems like the closer the requirements process is to the laws of physics, the more valuable design, and formal design, are in the engineering process. The closer the requirements are to users’ opinions, the less valuable those up-front design steps become.

    This doesn’t mean that there isn’t value in crisply writing down user requirements, and in exploring formal ways to specify them. Writing down requirements, formally or informally, is extremely valuable. Not writing them down can waste a lot of time, and cause a lot of friction, because people end up pulling in other directions. It does mean, though, that it’s often hard (and maybe not economical) to formally specify all forms of human requirements. I don’t know a way to specify UI aesthetic requirements, documentation readability, or even API naming consistency, for example.

    The other disagreement in this space also seems to come from different ideas about what formal approaches are, and how they’re valuable. I, for example, tend to see much of the circles and arrows school of software design as a waste of time that doesn’t directly engage with the really hard questions. That opinion may stem from ignorance, and is one I hold weakly. But I am sure that my university SE class experience of describing 100 lines of code in 100 reams of UML left me with an unreasonably low opinion of the benefits of designing software. Done badly, or with bad tools, even the most valuable things are useless.

    Which tools?

    Formal methods and automated reasoning are broad fields, with a vast number of tools. Over my career, in my domain of big cloud systems, I have found the following set useful (but am sure there are tools I would find useful if only I knew them).

    • Specification languages like P, TLA+, and Alloy, along with their associated model checkers.
    • Deterministic simulation tools like turmoil that allow, along with fuzzing, a principled approach to searching a state space through testing.
    • Verification-aware programming languages like Dafny and code verifiers like Kani. I’m not a deep expert in these tools, and have used them a lot less than the others.
    • Numerical simulation techniques. I tend to build my own (as I’ve written about before), but there are many frameworks and tools out there.
    • Whiteboard formal-ish methods. These are the methods like drawing decision tables, truth tables, explicit state machines, etc on the whiteboard or in design documents.
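
    As a tiny executable taste of that last bullet, here's a whiteboard-grade state machine explored by brute force. This is my sketch, not from the keynote: two processes take a naive lock whose check and set are separate atomic steps, and a breadth-first search over every reachable state finds the mutual-exclusion violation:

    from collections import deque

    def successors(state):
        pcs, lock = state
        for i in (0, 1):
            pc = pcs[i]
            if pc == "start" and not lock:
                yield (pcs[:i] + ("ready",) + pcs[i+1:], lock)  # saw lock free
            elif pc == "ready":
                yield (pcs[:i] + ("crit",) + pcs[i+1:], True)   # take the lock
            elif pc == "crit":
                yield (pcs[:i] + ("done",) + pcs[i+1:], False)  # release it

    init = (("start", "start"), False)
    seen, queue = {init}, deque([init])
    while queue:
        state = queue.popleft()
        pcs, _lock = state
        # The invariant: both processes must never be in the critical section.
        # This assert fires, because both can see the lock free before either
        # sets it - the model checker finds the race for us.
        assert pcs != ("crit", "crit"), f"mutual exclusion violated: {state}"
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)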

    I like the survey of methods in Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3 too. That’s a great place to start.

    But, to the point of this post, we don’t want to get too tied to the idea that verifying an implementation is the only worthy end goal here. That is a worthy goal indeed, but I have found a great deal of value in using tools like TLA+ and P to think through designs more quickly and concretely before building.

    Faster software, faster

    Back in 2015 when we were writing How Amazon Web Services Uses Formal Methods for CACM, my focus was very much on correctness. On verifying the safety and liveness properties of my designs, and in getting to a correct design faster. In talking to one of the teams that was using TLA+ on an internal lock management system, I discovered something I loved even more. These words, from Table 1:

    verified an aggressive optimization

    This was eye-opening. Not only do tools like TLA+ help us to build systems faster, they help us to build faster systems. They allow us to quickly explore for possible optimizations, find the constraints that really matter, and check whether our proposed optimizations are correct. They, in many cases, remove the hard trade-off between correctness and performance which many systems can get stuck in.

    Conclusion

    Using tools to help us think about the design of systems at the design stage can significantly increase the speed of software development, reduce risk, and allow us to develop more optimal systems from the beginning. For those of us working on large-scale and complex systems, they are simply a part of a good engineering practice.


    Footnotes

    1. And other related things, like performance, scalability, sustainability, and efficiency.
    2. I love this quote so much I bought a copy of this Wellington’s 1877 book just to see it on paper.

  43. Glory is only 11MB/sec away

    (lovergne.dev)

    A good article questioning the necessity of horizontal scaling and deploying at the edge. Indeed, I think that for most applications you might want horizontal scaling but you don't actually need it; worse than that, it can actually hurt your application by adding unnecessary complexity. Remember what [Grug](https://lovergne.dev/archive/grub-brain) told us: complexity is the enemy.
    https://thmsmlr.com/cheap-infra

  44. What makes concurrency so hard?

    (buttondown.email)

    A lot of my formal specification projects involve concurrent or distributed systems. That's in the sweet spot of "difficult to get right" and "severe costs to getting it wrong" that leads to people spending time and money on writing specifications. Given its relevance to my job, I spend an awful lot of time thinking about the nature of concurrency.

    As the old joke goes, concurrency is one of the two hardest things in computer science. There are lots of "accidental" reasons why: it's hard to test, it's not composable, bugs can stay latent for a long time, etc. Is there anything that makes it essentially hard? Something that makes concurrent software, by its very nature, more difficult to write than synchronous software?

    The reason I hear most often is that humans think linearly, not concurrently, so are ill-equipped to reason about race conditions. I disagree: in my experience, humans are very good at concurrent reasoning. We do concurrent reasoning every time we drive a car!

    More generally, some studies find that if you frame concurrent systems in human terms ("meatspace modeling"), people get quite good at finding the race conditions. So while concurrency might be difficult to reason about, I don't think it's because of a fault in our brains.

    In my opinion, a better basis is state space explosion. Concurrency is hard because concurrent systems can be in a lot of different possible states, and the number of states grows much faster than anyone is prepared for.

    Behaviors, Interleavings, and Nondeterminism

    Take agents1 {1, 2, … n}, each of which executes a linear sequence of steps. Different agents may have different algorithms. Think writing or reading from a queue, incrementing a counter, anything like that. The first agent takes p1 atomic steps to complete, the second p2, etc. Agents complete their programs strictly in order, but another agent can interleave after every atomic step. If we have algorithms A1A2 and B1B2, they could execute as A1B1A2B2 or A1B1B2A2, but not A1B2B1A2.

    Under those conditions, here's an equation for how many possible orderings of execution (behaviors) can happen:

    (p1+p2+...)!
    ------------
    p1!*p2!*...
    

    If we have three agents all executing the same 2-step algorithm, that's 6!/8 = 90 distinct behaviors. Each step in the sequence can potentially lead to a distinct state, so there's 6*90=540 maximum distinct states (MDS). Any one of those can potentially be the "buggy" one.

    In most cases, the actual state space will be significantly smaller than the MDS, as different behaviors will cross the same states. Counterbalancing this is just how fast this formula grows. Three 3-step agents gives us 1700 possible behaviors (15K MDS), four 2-step agents instead have 2500 (20K MDS). And this is all without any kind of nondeterminism! If one step in one of the agents can nondeterministically do one of three things (send message M₁, send message M₂, crash), that triples the number of behaviors and the MDS.

    It's pretty common for complex concurrent systems to have millions or tens of millions of states. Even the toy model I use in this article on model optimization has about 8 million distinct states (though with some work you can get it much lower). I mostly think about state spaces in terms of performance because large state spaces take a lot longer to model-check. But it's also why concurrency is essentially hard to reason about. If my theoretical MDS is two million states, my practical state space is just 1% that size, and my human brain can reason through 99.9% of the remaining states… that still leaves 20 edge cases I missed.
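
    A quick sketch (mine, not from the newsletter) that both evaluates the counting formula above and brute-forces the classic lost-update example:

    from itertools import permutations
    from math import factorial, prod

    def num_behaviors(*steps):
        # (p1 + p2 + ...)! / (p1! * p2! * ...)
        return factorial(sum(steps)) // prod(factorial(p) for p in steps)

    print(num_behaviors(2, 2, 2))     # 90
    print(num_behaviors(3, 3, 3))     # 1680, the "1700" above
    print(num_behaviors(2, 2, 2, 2))  # 2520, the "2500" above

    # Toy model: agents A, B, C each run a nonatomic increment (read, then
    # write) against a shared counter. Enumerate every valid interleaving.
    program = [(agent, step) for agent in "ABC" for step in ("read", "write")]
    count, finals = 0, set()
    for order in permutations(program):
        # A behavior keeps each agent's own steps in order: read before write.
        if any(order.index((a, "read")) > order.index((a, "write")) for a in "ABC"):
            continue
        count += 1
        counter, local = 0, {}
        for agent, step in order:
            if step == "read":
                local[agent] = counter       # possibly-stale read
            else:
                counter = local[agent] + 1   # lost update if someone else wrote
        finals.add(counter)
    print(count, sorted(finals))  # 90 behaviors, final counter in {1, 2, 3}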

    Shrinking the State Space

    Here's a heuristic I use a lot:

    All means of making concurrency 'easier' are concerned first and foremost with managing the state space.

    It's not 100% true (no heuristic is) but it's like 60% true, and that's good enough. State spaces grow quickly and bigger state spaces cause problems. If we want to make maintainable concurrent systems, we need to start by shrinking the space.

    Like look at threads. Threads share memory, and the thread scheduler has a lot of freedom to suspend threads. So you have lots of steps (potentially one per line of code)2 and any interleaving can lead to a distinct state. I can use programming constructs like mutexes and barriers to "prune" the state space and give me the behaviors I want, but given how big the state space can be, I have to do a lot of pruning to get the right behaviors. I can make mistakes in implementation, "misshape" the space (like by adding a deadlock), or not notice a buggy state I need to remove. Threads are very error prone.

    I could instead switch to memory-isolated processes. The scheduler can still schedule whatever interleavings it wants but the processes can't muck with each other's internal memory. Internally a process is still executing A1A2A3A4, but if the only steps that involve external resources (or interprocess communication) are A2 and A4, then we only need to think about how A2A4 affects the state space. Three four-step agents have 35k interleavings, three two-step agents have only 90. That's a big improvement!

    What else can we do? Low-level atomic instructions do more in a single step, so there's no room for interleaving. Database transactions take a lot of physical time but represent only one logical time step. Data mutations create new steps, which immutable data structures avoid by definition.

    Languages have constructs to better prune the resulting state space: Go's channels, promises/futures/async-await, nurseries, etc. I think you can also treat promises as a way of forcing "noninterleaving": wait until a future is ready to run to completion (or to the next yield point) before executing it. Please don't quote me on this.

    I think CRDTs reduce the state space by permitting interleavings, but arranging things so that external changes are commutative: A1B1 and B1A1 lead to the same final result, so they are not distinct states.
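
    For example, here's a minimal sketch (my illustration) of a grow-only counter, about the simplest CRDT there is, where merges commute:

    # G-counter: each node increments only its own slot, and merge takes the
    # element-wise max. Merge is commutative (also associative and
    # idempotent), so different merge orders are not distinct states.
    def increment(state, node):
        new = dict(state)
        new[node] = new.get(node, 0) + 1
        return new

    def merge(a, b):
        return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

    a = increment(increment({}, "n1"), "n1")  # n1 incremented twice
    b = increment({}, "n2")                   # n2 incremented once
    assert merge(a, b) == merge(b, a)         # merge order doesn't matter
    assert sum(merge(a, b).values()) == 3     # the counter's value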

    Again, this is all a very rough heuristic.

    Limits

    To a first-order approximation, smaller state space == good. But this doesn't account for the topology of the space: some spaces are gnarlier than others. One that has lots of different cycles will be harder to work with than one that's acyclic.3 Different forms of nondeterminism also matter: "the agent continues or restarts" leads to more complex behavior than "the writer sends message M₁ or M₂." These are both places where "humans are bad at reasoning about concurrency" appears again. We often work through concurrency by reducing groups of states to "state equivalence classes" that all have the same properties. Complicated state spaces have more equivalence classes to work through.

    (Another way "humans are bad at reasoning about concurrency" can be a real thing: we might not notice that something is nonatomic.)

    Some high-level paradigms can lead to particular state space topologies that have fewer interleavings or ones that have more equivalent states. I've heard people claim that fork-join and pipe-filter are especially easy for humans to reason about, which I take to mean "doesn't lead to a gnarly state space". Maybe also event-loops? Where does the actor model fit into all of this?

    Another limit: there's a difference between buggy states and buggy behaviors. Some behaviors can go entirely through safe states but still cause a bug like "never reaches consistency". This is called a "liveness bug", and I talk more about them here. Liveness bugs are much harder to reason about.

    Okay, that's enough philosophical rambling about concurrency. Concurrency is hard, don't feel bad if you struggle with it, it's not you, it's combinatorics.


    Video Appearance

    I was on David Giard's Technology and Friends talking about TLA+. Check it out here!


    1. An agent is any concurrent actor, like a thread, a process, a future, a human, etc. Concurrency literature uses "processes" a lot but that clashes with OS processes. 

    2. Or more! If writes are nonatomic, you can get data races, where two threads write to different bits in a byte at the same time. Then a variable assignment can be multiple steps in one line. 

    3. Is there any research correlating unusual subgraphs in state spaces with probable bugs? 

  45. Hardest Problem in Computer Science: Centering Things

    (tonsky.me)

    Somehow we forgot how to center rectangles and must find our way back

  46. OSI Approved in license metadata for Python project

    (safjan.com)

    When it comes to sharing and distributing Python projects, clarity about licensing is crucial. A license tells users what they can and cannot do with your code, impacting everything from contributions to commercial use. The pyproject.toml file, as outlined in PEP …

  47. (chriscoyier.net)

    I’ve heard the new cooperative version of Scrabble made fun of a few times. These pansy youths just wanna hold hands, drink warm milk, and avoid any conflict. Whatever. Nobody is threatening the value of competition. Me, I think cooperative games are awesome. There are still challenges. You work together to solve them. Like, I […]

  48. Cliffle

    (lovergne.dev)

    A great blog about low-level and embedded programming. The author works at [Oxide](https://oxide.computer/) and is one of the creators of [Hubris](https://cliffle.com/blog/on-hubris-and-humility/), an OS designed for *"32-bit microcontrollers with region-based memory protection and kiB to MiB of RAM and Flash"*. I really like the detailed blog posts about Hubris, and they always make me want to try my hand at an embedded system. It's not on my roadmap yet, but one day it will be!
    https://cliffle.com/

  49. Weekly Update 395

    (troyhunt.com)

    Presently sponsored by: Kolide believes that maintaining endpoint security shouldn’t mean compromising employee privacy. Check out our manifesto: Honest Security.

    Data breach verification: that seems like a good place to start given the discussion in this week's video about Accor. Watch the vid for the whole thing but in summary, data allegedly taken from Accor was published to a popular hacking forum and the headlines inevitably followed. However,

  50. Some notes on for loops

    (buttondown.email)

    New Blogpost

    Don't let Alloy facts make your specs a fiction, about formal methods practices (and a lot of Alloy). Patreon link here.

    Some notes on for loops

    I sometimes like to sharpen the axe by looking at a basic programming concept and seeing what I can pull out. In this case, for loops.

    There are two basic forms of for loops:

    for(i = 0; i < 10; i++)
    
    for val in ["a", "b", "c"]:
    

    The second is usually called a foreach loop, and I'll call the former "forstep", after the ALGOL-60 construct. Almost all languages have one of the two, some have both. While forstep loops show up early, foreach loops only appear in PLs ~20 years later.

    Looking at this, here's some things I see:

    Construct reduction

    Given a foreach loop, I can syntactically translate it into a forstep loop:

    for val in arr:
        # ...
    
    -->
    
    for(i = 0; i < arr.len; i++) {
        val = arr[i];
        // ...
    }
    

    There's a lot of terms we use to express this property. We can say that foreach is homomorphic or reducible to forstep, or that foreach can be constructed out of forstep, or others I can't think of rn.

    If we could also construct forstep from foreach, we could say the two are equivalent or isomorphic. In theory we could do this if we had a range operator:

    for(i = 0; i < 10; i += 2) 
    
    --> 
    
    for i in range(0, 10, step=2);
    

    Most forstep implementations, though, can do things like modify the iterator in the middle of the loop:

    for(i = 0; i < 10; i++) {
        // stuff
        if(cond) i = 0;
        // stuff
    }
    

    You can't convert this to a foreach loop: it's an "irreducible" feature.

    Most people consider modifying the index bad practice, though. 99% of the use cases of forstep loops are constructible with foreach loops, so the two are morally equivalent. I'll treat them as roughly interchangeable going forward.

    Primitives and Expressivity

    For loops are reducible to while loops:

    for(i = 0; i < 10; i += 1) 
    
    -->
    
    i = 0;
    while i < 10 {
      // ...
      i++;
    }
    

    But while loops are not reducible to for loops:

    while(user_cmd != "quit")
    
    -->
    
    ???
    

    Again, lots of terms to cover this: while loops are more "primitive", more "powerful", more "low-level", more "expressive", etc. If you have while loops, you don't need for loops, and if you have goto, you don't need either.

    But you shouldn't use a goto when a while loop works or a while loop where a for loop works. This is a generalization of the rule of least power (RLP): given a problem, use the tool that is most specific to the problem. If you can iterate through a list with a foreach loop, don't use a while instead!

    There's two reasons for this. One is that it's more communicative to other programmers. If they see a foreach loop, they know it's going to be used for some kind of iteration, while that's not clear with the while loop. Two is the capability/tractability tradeoff: the more limited a construct is, the easier it is to understand and work with. A foreach loop can do less than a while loop, but it's much easier to guarantee that it processes each item only once.

    Programming languages evolve to replace the all-powerful "low-level" constructs with more limited constructs that reduce to the low-level ones. It's why you get things like interfaces over inheritance.

    The corollary of this is that if you use a powerful construct, it should be because the limited one can't do what you want. Don't force an infinite for-loop!

    Extensions to for loops

    For loops reduce to while loops, while loops reduce to goto, everything reduces to goto. So there are some kinds of looping goto can do that for and while cannot.

    They get a little closer to feature-parity by adding flow-control inside the loop: continue for advancing the loop early, break for terminating it. I didn't look into when or how these were added, but I'd guess they come from looking at what goto can do that loops could not, and adding features as needed.

    "Imperative programming" is just for loops

    I think one of the most basic functions of a programming language is to iterate through an array. So much so that you can distinguish different programming paradigms by how they iterate.

    • Imperative programming: for loops
    • OOP: a foreach method on iterable objects
    • Functional programming: recursion, reduce (we'll talk about this later)
    • Logic programming: I'm less familiar with this, but I think it's recursion + pattern matching (which ML and Haskell adopted)
    • Array programming: all operations take arrays by default.

    I think this explains why arrays are universal while dictionaries are relatively new: there's no obvious basic operation that covers what you do with dicts to the degree that for covers what arrays do.

    (Question: do list comprehensions lead to a new paradigm? How do you describe "iteration" in spreadsheet languages?)

    Functional replacements

    When people talk about replacements for foreach loops in FP, they usually bring up map and filter. But these are not equivalent. Rather, map and filter are "preferred" to loops because they are more limited. It's another instance of RLP.

    As far as I can tell, reduce is equivalent to foreach. Morally, anyway. This is interesting because reduce is considered part of the "big three" (along with map and filter), but it's also the one that's the "odd one out". It's the hardest for people to understand, and Haskell even has three versions of it. Is it confusing because it's so powerful?
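
    To make "morally equivalent" concrete, here's a sketch (mine, using Python's functools.reduce) of a foreach loop built out of reduce:

    from functools import reduce

    def foreach(f, xs):
        # reduce threads an accumulator through xs; ignore the accumulator
        # and call f purely for its side effects, and you have foreach.
        reduce(lambda _acc, x: f(x), xs, None)

    foreach(print, ["a", "b", "c"])  # prints a, b, c in order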

    Map and filter are composable in a way that foreach loops are not, but they're also weirdly less "composable" in one very specific way? Consider something of the form

    map f (filter p arr))
    
    -->
    
    out = []
    for x in arr:
      if p(x):
        out.append(f(x))
    

    The map/filter version iterates over the list twice, while the foreach loop iterates only once. I know Haskell can fuse maps together as an optimization but I don't think you can safely fuse arbitrary map/filters? I dunno.

    You can fuse map and filter with a reduce:

    reduce ((out, x) => 
            if p(x) 
            then out ++ f(x)
            else out
           ), [], arr
    

    This gets really awkward though.

    I wonder if "reduce is equivalent to for loops" plus "reduce is awkward for fusion" somehow inspired Clojure transducers.

    I'll be writing this forever

    Things I could have looked at in more detail but ran out of time and energy:

    • "Iterators" as a general concept.
    • Loop invariants and variants. This is really relevant to formal methods in particular.
    • Specific implementations of for loops in languages. ALGOL's loop expression-lists, Python's for else, Raku's phasers, javascript's for in/for of distinction, Ada and Chapel loop parallelization constructs... If I keep going I'll get something as long as this.

    Following any of those leads would get us to different insights, as opposed to the main thing that ended up catching my interest with these notes, which is how the rule of least power applies to language constructs. I should probably write a blog post about it!

    Anyway, I think this kind of exercise, looking at something everybody takes for granted and trying to learn things from it, is really valuable for understanding software better. I don't know how interesting it is to read about, though. Lemme know if you think you'd like to see this for other stuff, or whether I should just share the essays that are downstream of the practice (like writing about RLP).


    Podcast appearance: are we engineers?

    I was on a podcast, talking about the crossover project and what software engineers can learn from traditional engineering. Check it out here!