Email package maintainers when their packages are broken

Would it be beneficial to send emails to pkgs.racket-lang.org package maintainers?

I think this would let maintainers know about build problems such as undeclared dependencies, breakage due to a Racket upgrade, or a dependency update breaking a downstream package. We already collect their email addresses, and it seems like a good opportunity to engage devs maintaining great packages.

6 Likes

In principle, I think it's a good idea! However, I think it's very important that the mail volume for each package maintainer is relatively low. I assume many packages are maintained in their authors' free time and we shouldn't annoy them. (Ok, I think we shouldn't annoy them anyway. :wink: ) In particular, it would probably be bad if someone got so many unwanted mails that they decided to withdraw their package from the package server just to stop them.

Here are a few ideas along these lines:

  • The mails should not be sent if an error condition continues. A mail should only be sent if a condition changes from "ok" to "not ok", or it should be sent once if the condition isn't "ok" from the start, i.e. when the package server processes a new or updated package for the first time.
  • It should be possible to choose which problems trigger a mail. For example, I think it's more important to be notified of a necessary dependency that no longer exists on the package server than of a missing README (as long as some documentation exists).
  • It should be possible to opt out of notification mails completely (if a package maintainer really wants this).
  • I wonder if it makes sense to have mails on error/warning conditions adjustable per package. For example, someone might be interested in fixing problems in widely used packages right away, but not in fixing problems in very new experimental packages.
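To make the ideas above a bit more concrete, here is a minimal sketch of the decision "should this condition produce a mail?". Everything in it is hypothetical: `should-notify?`, the `prefs` hash, and its keys `opted-out?` and `muted-kinds` are names I made up for illustration, not anything the package server has today.

```racket
#lang racket/base

;; Hypothetical sketch: decide whether one condition change deserves a mail.
;; prev-ok? / now-ok? : was the condition "ok" before, and is it "ok" now?
;; kind              : a symbol naming the problem, e.g. 'missing-dependency
;; prefs             : assumed per-maintainer (or per-package) settings hash
(define (should-notify? prev-ok? now-ok? kind prefs)
  (and (not (hash-ref prefs 'opted-out? #f))                ; full opt-out
       (not (memq kind (hash-ref prefs 'muted-kinds '())))  ; per-problem filter
       ;; Only the "ok" -> "not ok" transition sends a mail. If a package that
       ;; has never been processed is treated as previously "ok", an initial
       ;; failure also yields exactly one mail.
       prev-ok?
       (not now-ok?)))
```

Per-package settings would fall out of the same shape if the server looked up a different `prefs` hash for each package.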

I don't know what an MVP for this feature would be. Please discuss! :slight_smile:

3 Likes

Let me first state that I am not volunteering to implement this. With that said: as the maintainer of quite a number of packages on pkgs.racket-lang.org, I would not object to an email for each release (that is, four times a year) that tells me the state of each of my packages. I would also opt in to such a system, if there were a check-box for it....

5 Likes

A few brainstorming thoughts:

  • I agree the notifications should occur only for state changes. (No one wants a daily "it's still broken" message.)

  • You could think of each package as having a single state, composed of multiple boolean pieces -- like (built-without-errors? tests-passed? has-documentation? has-good-dependencies?). You'd get a single, initial notification. Thereafter you'd only get notified if/when any of the sub-states change. I think that approach might avoid needing a UI and storage for users to customize which changes to be notified about, how often, etc. Would that be sufficiently non-noisy... maybe??

  • I keep saying "notifications" because this feels like something to delegate to (just for example's sake) AWS Simple Notification Service -- which handles thankless tasks like delivering emails (or other notify methods?), unsubscribes, etc. But I'm not really proposing AWS SNS because cost, because Amazon, etc. Although it's hard for me to imagine anyone doing this kind of work for free, does anyone know of anything like this?
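The composite-state idea above could be sketched roughly like this. The field names come straight from the list in this post; the struct `pkg-state` and the function `changed-fields` are hypothetical names, and nothing here reflects how the package server actually stores its results.

```racket
#lang racket/base

;; One package's state as four independent booleans.
(struct pkg-state (built-without-errors?
                   tests-passed?
                   has-documentation?
                   has-good-dependencies?)
  #:transparent)

;; Field name paired with its accessor, so the states can be compared piecewise.
(define fields
  (list (cons 'built-without-errors?  pkg-state-built-without-errors?)
        (cons 'tests-passed?          pkg-state-tests-passed?)
        (cons 'has-documentation?     pkg-state-has-documentation?)
        (cons 'has-good-dependencies? pkg-state-has-good-dependencies?)))

;; Names of the sub-states that differ between two snapshots.
;; An empty list means "nothing changed, send nothing".
(define (changed-fields old new)
  (for/list ([f (in-list fields)]
             #:unless (eq? ((cdr f) old) ((cdr f) new)))
    (car f)))
```

For example, `(changed-fields (pkg-state #t #t #t #t) (pkg-state #t #f #t #t))` gives `'(tests-passed?)`, and a notification would go out only when that list is non-empty.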

People on Discord recently discussed the idea that, instead of letting the package server pull package updates automatically, it may be better to have users explicitly run a command to notify the package server (raco deploy-pkg? raco publish-pkg?). I think as a part of this publishing, raco setup --check-pkg-deps --unused-pkg-deps + raco test can be run by default, and if they don't pass, it won't by default let you publish to pkgs.racket-lang.org. This would probably take care of most issues, esp. dependency-related ones.

I really don't think this is feasible. Building the code is a tricky and time-consuming process, and we wouldn't want to prevent publishing while waiting for that to happen, especially because it's non-trivial to do it for only a single package.

I truly hope that if anyone is pushing a publish button, the package is installed and well-tested already. With that premise, raco setup should terminate almost immediately, no?

Running tests could indeed take time, and there could be an option to opt out of that.

I think I said this on Discord, but I see a lot of downsides to an explicit publish command. One of the things I like about Racket's package system is that package catalogs are conceptually very simple, basically just mappings from package names to explicit package sources (in the sense of section 2, "Package Concepts", of the package management documentation). Likewise, the pkgs.racket-lang.org catalog is designed not to be technically "special" (of course, it is special in its social role). I think adding an explicit publish command would weaken these features.

Right now, people who want to have explicit releases can (and some do) use a Git release branch. By keeping the notion of releases at the level of package sources and checksums, rather than tying releases to a particular catalog, other catalogs that want to follow updates can find out about releases in just the same way as the pkgs.racket-lang.org catalog. (Some catalogs don't want to follow updates, and they can do that either with raco pkg catalog-archive or with a transformation to make all Git package sources point to a commit id.)

I also don't know how an explicit update command would work with non-Git package sources. Admittedly these already have some characteristics I dislike: to release an update, you either have to log in and change the URL or point the existing URL somewhere else, both of which complicate long-term reproducibility (though not insurmountably so). But the existence of non-Git packages means that package authors will always be able to make updates without explicitly informing pkgs.racket-lang.org. We could error in such cases (since there would be a checksum mismatch), which seems unfriendly; otherwise, we would continue providing implicit updates, and, if we're doing it for non-Git packages, why not continue to have it be possible for Git packages, too?

Back on the original topic:

I think we'd probably need at least some customization, since there are, for example, packages that have no-documentation warnings because they're documented in a different package. But maybe it would be better to address the underlying problems somehow, like a documented-by field in info.rkt.

There are various "transactional email" providers that do some of this: some that I'm aware of are Mailgun, Postmark, Sendgrid, and Sendinblue, but I only have personal experience with AWS Simple Email Service (very large free tier, but generally the pros and cons you'd expect with AWS). For send-only transactional email, I think it could be relatively reasonable to just send with a local MTA (and I don't have a rose-tinted view of self-hosting email these days more generally).

But actually, I guess there are already account confirmation emails and such sent from pkg@racket-lang.org, so we could probably just keep doing whatever we do for that.

2 Likes

Ah, you're thinking of running this locally when you publish. That certainly seems reasonable for the reasons you say, if we had a publish command.

Starting a separate sub-thread to brainstorm a different approach:

In contrast to pushing emails at people, give them a way to pull this periodically.

The basic building blocks are simply:

  1. (call/input-url (string->url "https://pkgs.racket-lang.org/pkgs-all") get-pure-port read), which returns a hash table mapping package names to a hasheq of metadata. Each package's metadata contains a mapping under the key build, which seems to have the relevant info (AFAICT this is where the pkgs web site gets the info it shows). For example:
         (build
          .
          #hash((conflicts-log . #f)
                (dep-failure-log . "server/built/deps/3d-model.txt")
                (docs . (("main" "3d-model" "doc/3d-model@3d-model/index.html")))
                (failure-log . #f)
                (min-failure-log . #f)
                (success-log . "server/built/install/3d-model.txt")
                (test-failure-log . #f)
                (test-success-log . "server/built/test-success/3d-model.txt")))

Maybe a little package could wrap this in a friendlier interface for use from Racket programs -- as well as some CLI like (say) raco check-my-packages --email foo@bar.com. Maybe there are flags like --ignore-missing-docs.

  2. Most CI services like GitHub Actions allow you to define "cron jobs" that run periodically. So you can run the above on the schedule you want, and be notified in whatever way you already prefer using that CI service.

In other words this delegates all customization and notification concerns out to other systems that people may have already chosen and be using.

This requires a package author to take some action to set this up to "pull", as opposed to pushing things at them unsolicited. You can look at this as an advantage or a drawback; I understand the arguments both ways.
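As a rough sketch of the "pull" building block: the code below reads the build hash shown earlier and lists which failure logs are present. The key names are taken from that example; treating every non-#f `*-failure-log` or `conflicts-log` entry as a problem is my assumption, as is the presence of an `author` string in each package's metadata. The function names are made up for illustration.

```racket
#lang racket/base
(require net/url racket/string)

;; Keys in a package's `build` hash that (by assumption) signal a problem
;; whenever their value is not #f.
(define failure-keys
  '(failure-log dep-failure-log test-failure-log conflicts-log min-failure-log))

;; List the failure keys that are set in one `build` hash.
(define (build-problems build)
  (for/list ([k (in-list failure-keys)]
             #:when (hash-ref build k #f))
    k))

;; Download the whole catalog. This performs a network request,
;; so it is a function to call on demand, not a top-level value.
(define (fetch-all-packages)
  (call/input-url (string->url "https://pkgs.racket-lang.org/pkgs-all")
                  get-pure-port
                  read))

;; Pairs of (package-name . problems) for packages whose `author` string
;; contains `email` and that have at least one problem.
(define (problems-for-author all-pkgs email)
  (for*/list ([(name meta) (in-hash all-pkgs)]
              #:when (string-contains? (hash-ref meta 'author "") email)
              [problems (in-value (build-problems (hash-ref meta 'build (hasheq))))]
              #:unless (null? problems))
    (cons name problems)))
```

A scheduled CI job could then call `(problems-for-author (fetch-all-packages) "foo@bar.com")` and exit with a non-zero status when the result is non-empty, which leaves the actual notifying to the CI service.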

5 Likes

A minimal working example of this approach: Sketch of checking catalog server for package problems · GitHub

2 Likes