It also includes some exercises to gain insight into packages and collections, and some discussion of other approaches.
I started this post several weeks ago after migrating my own library, Qi, to this structure. This was necessary to easily include third-party packages in the default "qi" distribution without introducing too many dependencies in the core functionality (and was suggested by Stephen De Gabrielle @spdegabrielle). As it took me a while to set up, I felt I ought to blog about it to document it for the next person. Now that I've written the post, I understand why it wasn't documented to begin with. They say that fools rush in where angels fear to tread. Package management is a large, complex, and at times tedious topic. I've done my best to smooth over the tedium and get across the broad ideas in this context, while providing an explicit how-to. It was not easy, so I hope there will be some poor soul down the road who will consider my foolish efforts worthwhile.
Last week, Simon Schlee @simonls coincidentally brought up this topic for discussion on Discourse, and it surfaced a lot of critical opinions on this approach. This was timely, and I think it complements the above post by giving a fuller picture of the various considerations on this complex subject. I've linked to that topic from the post for greater visibility of these issues. Maybe with clarity will come needed reform.
I like your post a lot. I think that when I was starting out with Racket, it would have helped me get up to speed in understanding that package organization strategy.
The only thing I would like to state explicitly — because I am unsure that it is conveyed 100% to the reader — is that single-collection vs multi-collection is orthogonal to single-package vs multi-package.
You can use them in any combination you like, e.g. in my define-attributes package I use single-collection with multi-package. Sometimes packages organized as -lib/-doc/-test use (define collection 'multi) in their combining package, even when the sub-packages all use a single collection. When I was starting out, that gave me the impression that multi-collection was required in order to use multi-package, but this is not true.
This is why I prefer it when the combining package uses something that matches the sub-packages, e.g. only uses multi-collection if at least one of the sub-packages does too.
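To make the single-collection, multi-package combination concrete, here is a minimal sketch of the two info.rkt files involved. The package names mylib and mylib-lib are hypothetical, and the dependency lists are only illustrative. First, the implementation package, assumed to live in a directory mylib-lib/:

```racket
#lang info
;; mylib-lib/info.rkt (hypothetical package name)
;; A single-collection package: its modules are installed as mylib/...
(define collection "mylib")
(define deps '("base"))
```

Then the combining package, assumed to live in a directory mylib/:

```racket
#lang info
;; mylib/info.rkt (hypothetical package name)
;; Also single-collection, matching the sub-package rather than using 'multi.
(define collection "mylib")
;; Every package listed in `implies` must also be listed in `deps`.
(define deps '("base" "mylib-lib"))
(define implies '("mylib-lib"))
```

Both packages contribute to the same mylib collection, so nothing here requires the multi-collection layout.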
I also wonder whether there should be a way to state within the info.rkt that this package has no collection, i.e. that it is an empty package used strictly to combine the sub-packages. It feels wrong to specify a collection when you never want it to contain anything (in that particular package).
Maybe (define collection 'none) could indicate that the package has no collection of its own and is composed only of its sub-packages.
Thanks, I'll incorporate this feedback into the post! I agree that (define collection 'multi) in the composite package is confusing and ambiguous. I also like the idea of adding a none option here to explicitly mark out a package as a composite -- I think in your define-attributes library if you adopted none, you would need to extract the tests and docs into a separate define-attributes-support package, right? Not that that's necessarily the right thing here but just to understand the implications of none.
Yes, that would be the idea / result. In the define-attributes package I used the two-package strategy suggested by @ryanc, so I think using a single-collection declaration for both works well there (because neither is empty).
But in general it may make sense to collect ideas for a while until some bigger vision emerges for how things could be improved in a coherent manner. So mostly I wanted to mention that idea, so that it isn't forgotten.
Btw, so as not to fragment the discussion on this topic, I would suggest that anyone reading post their comments on the other topic if they are more broadly about the lib/test/doc scheme or package management, and post here if they are feedback on the blog post specifically, e.g. if I should add anything / something is unclear / etc. Thanks!
A very useful and helpful post, @countvajhula, thank you very much! I have long been troubled by the way in which Racket packages are structured, and your filesystem analogy finally fixed everything for me.
I realize now that the three-way (or multi-way) split and the global namespace is reminiscent of Linux filesystems with directories bin/, share/, doc/, lib/, etc.
I had an idea a while back to provide one version of a library, but provide backwards compatibility and versioning via submodules. Obviously, prioritize not breaking things, but when breaks are necessary, introduce a new submodule for the new version and leave the previous versions' submodules in place. Additionally, don't try to keep previous versions' code pristine; adapt it as necessary to minimize duplication. The different submodules would exist solely for incompatible behaviors (so in many cases you would just import and provide common functions across the various versions).
Racket submodules are neat in cases like this because all users will end up benefiting from many improvements/bug fixes without changing their code while the library developer can continue innovating on the API and introducing backwards incompatible changes. The main problem with this approach is that using submodules in Racket can be rather tedious. Additionally, since Racket made the very unfortunate decision to make "import *" style imports the default, introducing new functions can be considered backwards incompatible.
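A minimal sketch of what this could look like, under some assumptions: the library name shapes, the functions area and scale, and the submodule names v1 and v2 are all hypothetical, and the library is written as a nested module only so that the example fits in one file.

```racket
#lang racket/base

;; Hypothetical library, nested here so the sketch is a single file.
(module shapes racket/base
  ;; Shared implementation, reused unchanged by every API version.
  (define (area w h) (* w h))

  ;; Version 1 of the API: `scale` takes the factor positionally.
  (module+ v1
    (provide area scale)
    (define (scale x factor) (* x factor)))

  ;; Version 2 changes `scale` incompatibly: the factor is now a keyword
  ;; argument. `area` is simply re-provided from the shared code.
  (module+ v2
    (provide area scale)
    (define (scale x #:by factor) (* x factor))))

;; A client requires the submodule for the API version it was written against.
;; Both versions are loaded here (with prefixes) only to show them side by side.
(require (prefix-in v1: (submod "." shapes v1))
         (prefix-in v2: (submod "." shapes v2)))

(v1:scale 3 2)
(v2:scale 3 #:by 2)
(v1:area 2 5)
```

A bug fix to area reaches users of both versions without any change on their side, while only the incompatible scale is duplicated. If the library were a separate file or an installed collection, the client would presumably name it in the submod form instead of using ".".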
Neat. I'm not sure I follow how this works exactly, but I bet it would be an interesting post if you chose to write it out in more detail with examples. I know folks have brought up handling backwards compatibility in the past, and there doesn't seem to be a standard way in the Racket community. Your scheme here sounds like one interesting way to do it.
Re: introducing new functions being backwards incompatible, do you know about version-case? You could potentially use this to leave out the new definitions altogether depending on the version.