Nix NYC 4/27/26
See you at the Nix NYC meetup in two weeks! We’re looking forward to hosting again.
29 W 30th St, Fl 11, New York, NY 10001
Tip for when you’re struggling to activate a NixOS or nix-darwin config: if you can build it, but the switch fails, you can run the activation script manually. A NixOS or nix-darwin config isn’t special; it’s an “activation script”:
$ sudo nixos-rebuild switch ...
=
$ nixos-rebuild build ...
$ sudo ./result/activate
And with flakes in particular, you get:
$ sudo nixos-rebuild switch --flake .#foo
=
$ nix build .#nixosConfigurations.foo.config.system.build.toplevel
$ sudo ./result/activate
(Or `darwin-rebuild ...` and `.#darwinConfigurations`, respectively)
The latter can be useful to bootstrap a nix-darwin configuration on a system without `darwin-rebuild`, or with a broken darwin config. Or just to peek inside a given OS configuration’s files, without installing it.
In our NixOS tests we often spin up datastores -- DynamoDB, ElasticMQ, Redis, etc. But they are way too eager to report themselves ready, which causes dependent tasks to fail to connect.
We realized we could add a one-liner to the systemd config for these services which waits until the datastore’s port is open. Then, downstream tasks waiting for `default.target` will not start until the datastores are actually ready.
`until nc -z localhost "$1"; do sleep 1; done`
We put it in a script called `wait-for-port`. Usage in our case looks like adding it to the systemd config and leveraging the `postStart` option. E.g.
systemd.services.elasticmq = {
postStart = "wait-for-port ${toString config.services.elasticmq.port}";
path = [ wait-for-port ];
};
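For reference, the wrapper could be packaged along these lines (a sketch, not necessarily how our actual `wait-for-port` derivation looks):

```nix
# Hypothetical packaging of the one-liner; the real wait-for-port may differ.
wait-for-port = pkgs.writeShellApplication {
  name = "wait-for-port";
  # Pick a netcat whose `nc -z` actually works in NixOS tests
  # (netcat-gnu gave us trouble; see below).
  runtimeInputs = [ pkgs.netcat-openbsd ];
  text = ''
    until nc -z localhost "$1"; do sleep 1; done
  '';
};
```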
As an aside, we found that the first result for netcat on NixOS Search, netcat-gnu, works on darwin but did not work in a NixOS test on Linux, in such a way that caused `wait-for-port` to hang forever... It was last updated in 2006 and lives on SourceForge.
While building dune2nix, Shun ran into some lockfile issues in Dune and submitted patches to fix them:
Dune has a Nix flake, so you can try these changes with: `nix run github:ocaml/dune#dune`
We open-sourced dune2nix, a Nix library to turn Dune-based OCaml projects into Nix derivations.
Like uv2nix and package-lock2nix, `dune2nix` parses Dune's lockfiles fully at Nix eval time, which gives us: no codegen, no Import From Derivation (IFD), and no hardcoded hashes.
Released under AGPLv3 (but open to other licenses)
Apparently `xargs -n 1 𝑥` ≠ `xargs -I % 𝑥 %`
$ echo a b c | xargs -n 1 echo a b c
$ echo a b c | xargs -I % echo % a b c
No, you must instead:
$ echo a b c | xargs -n 1 | xargs -I % echo % a b c
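Spelled out with outputs (`X` is just a placeholder argument): `-n 1` chops the input into one-argument invocations, while `-I %` substitutes the whole input line, so you need both to get per-item substitution.

```shell
echo a b c | xargs -n 1 echo X
# X a
# X b
# X c
echo a b c | xargs -I % echo X %
# X a b c
echo a b c | xargs -n 1 | xargs -I % echo % X
# a X
# b X
# c X
```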
Why are you like this, POSIX? T_T
We use CDKTF, but now that it's deprecated, providers have stopped publishing bindings. While we migrate off it, we need to generate provider bindings ourselves in the meantime. The official way of doing this is by running `cdktf get`, but we got nerd-sniped (as always) and wrote a little derivation that generates provider bindings at Nix build time, in a sandboxed environment:
{
cdktf-cli,
writableTmpDirAsHomeHook,
nodejs,
terraform,
stdenv,
writeTextFile,
lib,
# We use https://github.com/nix-community/nixpkgs-terraform-providers-bin
#
# Something like:
#
# inputs.nixpkgs-terraform-providers-bin.legacyPackages.${system}.providers;
terraform-providers,
}:
let
# Providers you want to generate bindings for
providers = with terraform-providers; [
hashicorp.aws
hashicorp.random
hashicorp.null
];
# Language of the bindings
language = "typescript";
# Minimal cdktf.json used for generating the bindings.
cdktfJson = writeTextFile {
name = "cdktf.json";
text = builtins.toJSON {
inherit language;
app = "unused-can-be-anything";
terraformProviders = map (
provider:
# Assuming registry.terraform.io because nixpkgs-terraform-providers-bin has
# an everlasting TODO: https://github.com/nix-community/nixpkgs-terraform-providers-bin/blob/4f8dfea41cd94403a6c768923b3ddcb15fd4c611/default.nix#L26
lib.replaceString "registry.terraform.io/" "" provider.provider-source-address
) providers;
};
};
in
stdenv.mkDerivation {
name = "cdktf-bindings";
nativeBuildInputs = [
cdktf-cli
nodejs
(terraform.withPlugins (_: providers))
# cdktf wants to write in homedir for cache
writableTmpDirAsHomeHook
];
dontUnpack = true;
# Disable telemetry, requires internet access.
CHECKPOINT_DISABLE = 1;
buildPhase = ''
cp ${cdktfJson} cdktf.json
cdktf get
'';
installPhase = ''
mkdir -p $out
cp -r .gen/* $out/
'';
}
Writing this was fun, but maintaining it would not be fun. Given that CDKTF is officially deprecated, we have chosen to just directly vendor the bindings for now, while we migrate off of CDKTF entirely. That being said, we thought this was a cool little snippet and rather than bin it, we wanted to send it out into the ether on its own journey. Maybe he can find someone out there who can properly appreciate him :)
Sayonara, CDKTF.
To populate your binary cache with e.g. the last 24 hours’ worth of derivations from your machine’s /nix/store you can use:
$ nix path-info --all --json \
| jq -r 'with_entries(select(.value.registrationTime > (now - 60 * 60 * 24))) | keys | .[]' \
| xargs -r nix copy --to ....
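To illustrate what the jq filter does, here it is against a hand-written sample of `nix path-info --json` output (paths and timestamps made up; the first is from 2000, the second far in the future):

```shell
echo '{
  "/nix/store/aaa-old":   {"registrationTime": 946684800},
  "/nix/store/bbb-fresh": {"registrationTime": 9999999999}
}' | jq -r 'with_entries(select(.value.registrationTime > (now - 60 * 60 * 24))) | keys | .[]'
# /nix/store/bbb-fresh
```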
Or for Cachix you can use:
...
| xargs -r cachix push my-cache-name
Related: neither of these tools’ native concurrency or chunking primitives seems to be quite as reliable as plain old multi-process parallelism using xargs. In the end, this always wins:
...
| xargs -P 20 -n 1000 -r ...
Kind of sad. :(
Our brrr workers use BLPOP to consume jobs from Redis. After enabling redis-py's built-in retry (which covers ConnectionError and TimeoutError), alerts kept firing during failovers:
ResponseError: UNBLOCKED force unblock from blocking operation, instance state changed
Redis sends this when it boots a blocked client during failover. You can't just add `ResponseError` to `retry_on_error` -- it's the base class for OOM, READONLY, NOPERM and others, most of which indicate a persistent problem. And redis-py doesn't expose a structured error code for types it doesn't recognize. For `UNBLOCKED`, you just get a generic `ResponseError` with the raw message string.
So we subclassed `Retry` to parse the RESP error type ourselves and only retry on `UNBLOCKED`:
def _redis_response_error_type(exc: ResponseError) -> str:
message = str(exc).strip()
if not message:
return ""
return message.split(None, 1)[0].upper()
class CustomRetry(Retry):
async def call_with_retry(self, do, fail):
# same retry loop as Retry, but with an extra branch:
while True:
try:
return await do()
except ResponseError as error:
if _redis_response_error_type(error) == "UNBLOCKED":
... # backoff and retry
raise # OOM, READONLY, etc. -- don't retry
except self._supported_errors as error:
... # backoff and retry (ConnectionError, TimeoutError)
Replace `Retry` with `CustomRetry` and done. Alerts resolved!
We’re fans of csvtk, a CLI toolkit to manipulate CSV/TSV files and pipelines in scripts. It makes for some elegant combinations with jq and awscli2 when building cleanup scripts etc.
It wouldn’t be a CLI if it didn’t have some odd gotchas. Today:
$ printf '2026-03-04T17:00:00-04:00\tfoo\n1998-01-01T00:00:00+00:00\tbar\n' > data.tsv
$ <data.tsv csvtk add-header -tn date,name | csvtk filter2 -stf '$name < "goo"'
date	name
2026-03-04T17:00:00-04:00	foo
1998-01-01T00:00:00+00:00	bar
$ <data.tsv csvtk add-header -tn date,name | csvtk filter2 -stf '$name < "doo"'
date	name
1998-01-01T00:00:00+00:00	bar
$ <data.tsv csvtk add-header -tn date,name | csvtk filter2 -stf '$name < "aaa"'
date	name
$ <data.tsv csvtk add-header -tn date,name | csvtk filter2 -stf '$name > "aaa"'
date	name
2026-03-04T17:00:00-04:00	foo
1998-01-01T00:00:00+00:00	bar
So far, so good. But:
$ <data.tsv csvtk add-header -tn date,name | csvtk filter2 -stf '$date > "aaa"'
[WARN] row 1: Value '1.772658e+09' cannot be used with the comparator '>', it is not a number
[WARN] row 2: Value '8.836128e+08' cannot be used with the comparator '>', it is not a number
date	name
$ <data.tsv csvtk add-header -tn date,name | csvtk filter2 -stf '$date > "2026"'
[WARN] row 1: Value '1.772658e+09' cannot be used with the comparator '>', it is not a number
[WARN] row 2: Value '8.836128e+08' cannot be used with the comparator '>', it is not a number
date	name
What‽
Gemini has no idea. Thankfully, we have Shun, who figured out that:
Date constants (single quotes, using any permutation of RFC3339, ISO8601, ruby date, or unix date; date parsing is automatically tried with any string constant)
- https://github.com/Knetic/govaluate
Sure enough, if you use a “fuller” date:
$ <data.tsv csvtk add-header -tn date,name | csvtk filter2 -stf '$date > "2026-01-01"'
date	name
2026-03-04T17:00:00-04:00	foo
$ <data.tsv csvtk add-header -tn date,name | csvtk filter2 -stf '$date < "2026-01-01T00:00:00+00:00"'
date	name
1998-01-01T00:00:00+00:00	bar
Thanks, Shun and Shen ☺
Our CDN now transparently binds all access tokens to the IP of the client. CloudFront Functions make this relatively pain-free and foolproof.
When the origin server gives a web browser a login token, it mints a JWT and puts it in a `Set-Cookie` header. This token is effectively equivalent to a username + password + 2FA combo for the duration of the session. We’ve set up two CloudFront functions: one to add a `clientAddress` to every outgoing JWT (and resign), and one to validate it on any incoming token. The origin server is none the wiser, but if any token ever leaks, it can only be used if you can convince CloudFront that you come from the same IP as the original user.
Relevant excerpt from the “clientAddress enricher”:
const cookie = response.cookies["access_token"];
if (!cookie) {
return response;
}
const decoded = _jwt_decode(cookie.value, secret);
const payload = decoded.payload;
payload["clientAddress"] = event.viewer.ip;
const toSign = decoded.header + "." + Buffer.from(JSON.stringify(payload)).toString("base64url");
response.cookies["access_token"].value = toSign + "." + _sign(toSign, secret, SIGNING_METHOD);
return response;
... and the “jwt validator”:
if (payload.clientAddress && payload.clientAddress != event.viewer.ip) {
throw new Error("viewer ip does not match token clientAddress");
}
An important reason this works for us: our users don’t use mobile. We only serve people on desktops with (relatively) static IPs. This technique won’t work for an arbitrary B2C website.
We maintain a brrr SDK in TypeScript and Python. Both provide implementations of the same backing data structures, and those classes carry the same docstrings. To keep them from going out of sync, Shun created a tool called `docsync`. It scans for docstrings tagged with <docsync>SomeKey</docsync> using tree-sitter, and checks that they are identical across both languages. E.g.:
/**
* A full brrr request payload.
*
* This is a low-level brrr primitive.
*
* The memo key must be generated by the instantiator of this class, and it
* must be deterministic: the "same" args and kwargs must always encode to the
* same memo key.
*
* Using the same memo key, we store the task and its argv here so we can
* retrieve them in workers.
*
* <docsync>Call</docsync>
*/
export interface Call {
...
and:
@dataclass
class Call:
"""A full brrr request payload.
This is a low-level brrr primitive.
The memo key must be generated by the instantiator of this class, and it
must be deterministic: the "same" args and kwargs must always encode to the
same memo key.
Using the same memo key, we store the task and its argv here so we can
retrieve them in workers.
<docsync>Call</docsync>
"""
We hooked it up to `nix flake check` so it’s automatically checked in CI.
It’s in brrr @ 137527a but we’ll probably move it out to its own repo at some point.
We’ll be hosting the next Nix NYC meetup, 3/18/26. See you there!
Yesterday, Ben noticed this blog’s contents weren’t refreshing, even if you explicitly clicked refresh; seeing changes required a hard refresh. Let’s look at the headers:
$ curl -D /dev/stderr -s -o /dev/null https://電.anterior.app/auth/login.html
HTTP/2 200
content-type: text/html
content-length: 11233
date: Fri, 27 Feb 2026 20:42:51 GMT
cache-control: max-age=86400
accept-ranges: bytes
last-modified: Thu, 01 Jan 1970 00:00:01 GMT
vary: accept-encoding
x-cache: Miss from cloudfront
via: 1.1 a086f9674a01c7542c440ffacd39476a.cloudfront.net (CloudFront)
x-amz-cf-pop: JFK52-P9
x-amz-cf-id: 7_XCBzHLxLFTjlJuOa1cG0WLhZv_yQ_pZfYopz23SUWy0KJGkgn4IQ==
x-frame-options: DENY
content-security-policy: connect-src 'self' https://anterior-master-platform.s3.us-east-2.amazonaws.com/artifacts/ https://anterior-master-platform.s3.us-east-2.amazonaws.com/uploads/; default-src 'none'; font-src 'self'; form-action 'self' https://anterior-master-platform.s3.us-east-2.amazonaws.com/uploads/; img-src 'self'; manifest-src 'self'; media-src 'self'; script-src-elem 'self'; style-src-elem 'self'; upgrade-insecure-requests ; worker-src 'self';
x-content-type-options: nosniff
strict-transport-security: max-age=31536000; includeSubDomains; preload
What’s that Last-Modified header? That’s the time to which all files are set when stored in the /nix/store:
$ nix eval --raw --expr 'builtins.toFile "foo" "hello\n"' | xargs -r date -u -Iseconds -r
1970-01-01T00:00:01+00:00
Unfortunately, even when you click refresh, a browser will send the If-Modified-Since header, and the server will say: nope, nothing changed since you last loaded this page; 304 Not Modified. And the browser won’t get the new content.
So the solution would seem to be: stop static-web-server from sending the Last-Modified header when that’s the value? A grep through their source code finds this:
// If the file's modified time is the UNIX epoch, then it's likely not valid and should
// not be included in the Last-Modified header to avoid cache revalidation issues.
let modified = meta
.modified()
.ok()
.filter(|&t| t != std::time::UNIX_EPOCH)
.map(LastModified::from);
They already thought of it. So why isn’t it working for us? Taking a closer look at that timestamp from the nix store: apparently it’s *1 second* after the epoch. Not exactly the epoch. Sure enough, the Nix source code confirms:
const time_t mtimeStore = 1; /* 1 second into the epoch */
Nooo. What’s easier, patching Nix, or patching static-web-server? Let’s try our hand at editing some Rust through sed through Nix, in an overlay on our monorepo’s nixpkgs instance:
overlays = [
(self: super: {
...
static-web-server = super.static-web-server.overrideAttrs {
prePatch = ''
${self.gnused}/bin/sed \
-i \
-e 's/\(\.filter.*t\) != .*UNIX_EPOCH/\1 > (std::time::UNIX_EPOCH + std::time::Duration::from_secs(1))/' \
src/response.rs
'';
# Some tests which implicitly relied on the above behavior now
# break. Force an mtime update to fix.
postUnpack = ''
find . -exec touch -m {} +
'';
};
})
];
Rebuild the web server and run it locally to test:
$ curl -D /dev/stderr -s -o /dev/null http://localhost:12345/auth/login.html
HTTP/1.1 200 OK
content-length: 11233
content-type: text/html
accept-ranges: bytes
vary: accept-encoding
cache-control: max-age=86400
date: Fri, 27 Feb 2026 20:57:34 GMT
Change a CSS rule, do a regular refresh, and: it works :)
AWS Struggle of the day: graceful exit of ECS tasks handling long running async jobs.
The clearest signal that ECS wants you to terminate is a SIGTERM, eventually followed by a SIGKILL. The maximum grace period ECS grants you is 2 minutes. 2 minutes is too short for our long running async tasks. :(
It seems we are not alone. For such cases, ECS introduced task termination protection: tasks can self-identify as protected, escaping downscaling until they’re done. This definitely solves the problem for fleets with less than 1✕ sustained job / worker load, notably auto-scaling fleets without parallel handling of jobs by workers. But if your workers support handling concurrent jobs, it’s unlikely they’ll ever be completely out of any work. And until they get a signal, they don’t know whether or not they’re “old”. :((
We settled on workers just scheduling themselves to gracefully exit every hour, so even in times of sustained load there will be task rescheduling events which will give ECS the opportunity to upgrade the tasks. But it’s convoluted, and it’s a hack on top of another hack. Wouldn’t it be nicer if you could just set a delay of 2 hours between SIGTERM and SIGKILL, instead of 2 minutes?
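The worker-side logic is roughly this shape (an illustrative sketch, not our production code; the names and the one-hour lifetime are placeholders):

```python
import signal
import time

class GracefulWorker:
    """Stop taking new jobs on SIGTERM, and also self-retire after
    max_lifetime seconds, so ECS regularly gets task rescheduling
    opportunities even under sustained load."""

    def __init__(self, max_lifetime: float = 3600.0):
        self.deadline = time.monotonic() + max_lifetime
        self.draining = False
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        # Finish in-flight jobs, but take no new ones.
        self.draining = True

    def should_take_new_job(self) -> bool:
        return not self.draining and time.monotonic() < self.deadline
```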
Our new favorite nix command is `nix flake archive`: copy all flake inputs to your store, and/or to a binary cache. Goes very nicely with `nix copy` to ensure private substituters always have all your flake inputs cached.
To pipe this into `nix copy` (or Cachix’s `cachix push`), use:
nix flake archive --json \
  | jq '.. | .path? | strings' \
  | xargs nix copy --to ... # or: cachix push my-cachix-bin
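The jq expression recursively visits every value in the archive’s JSON and keeps any string found under a `path` key. On a hand-written sample (paths made up; `-r` added for raw output):

```shell
echo '{"path": "/nix/store/root", "inputs": {"nixpkgs": {"path": "/nix/store/dep", "inputs": {}}}}' \
  | jq -r '.. | .path? | strings'
# /nix/store/root
# /nix/store/dep
```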
The implementation is surprisingly simple.
Does anyone know how you’re supposed to just build a flake app (not program) without running it? Best we could come up with is:
nix eval --raw --impure --expr \
'let
f = builtins.getFlake "git+file://${toString ./.}";
prg = f.apps.${builtins.currentSystem}.foobar.program;
in
builtins.head (builtins.attrNames (builtins.getContext prg))' \
| xargs -r nix-store -r
Surely there has to be a better way...
We open sourced our codegen flake module for declaring auto generated files in your flake.
Usage is as simple as:
$ nix run .#codegen
and:
$ nix flake check
We installed an edge function in CloudFront to validate that any JWTs were signed by a known JWT key. Copied almost verbatim from the CloudFront docs.
We explicitly whitelisted certain subdirectories from this check, `/auth/*` among others, to allow unauthenticated users to log in. That’s why we host this page on `/auth/login.html` ☺
The benefit: extremely small surface area for the code which does JWT validation. It severely limits the impact of a large class of potential bugs in the origin.
When you publish a flake, a sane baseline sanity check is usually: do my exposed packages at least build? The checkBuildAll flake module does that:
inputs.anterior-tools.url = "github:anteriorcore/tools";
...
flake-parts.lib.mkFlake { inherit inputs; } {
imports = [
inputs.anterior-tools.flakeModules.checkBuildAll
...
Now, `nix flake check` builds everything exposed through your flake’s `packages`.
From our nix tools repo.
We’ll be at the NY Nix Meetup this Wednesday. Looking forward to it!
We also open sourced brrr: a library-only, high-performance, bring-your-own-infra workflow scheduler. Crucial feature: no central orchestrator → no single point of failure.
TypeScript and Python implementations provided. Nix powered demo in the repo. Under active development.
Shun submitted patches to include elasticmq and dynamodb-local in services-flake. They both got merged, so you can now easily use them in process-compose:
services.dynamodb-local.mydynamodb.enable = true;
services.elasticmq.myelasticmq.enable = true;
We open sourced package-lock2nix, a tool to build NPM projects with a package-lock.json directly in Nix. Full package-lock.json parsing is done at eval time, meaning no separate `*2nix` command stage to run. Just `nix build` your project directly, and manage the package-lock.json file itself with regular build tools like npm.
Released under AGPLv3 (but open to other licenses)
Launched the anterior dev log. We’re hosting it under /auth/login.html because that’s the only path that our edge functions allow through unauthenticated.
Chose a non-ASCII app name to test the system’s handling of Unicode.