Single Package vs Multiple (-lib -doc -test) & Convenience

notjack · February 17, 2022, 5:38am

I'm strongly against splitting up packages in principle, though I see the practical reasons to do so today. But our tools are letting us down if they're forcing us to manually and mechanically split apart code into arbitrary pieces.

countvajhula · February 17, 2022, 9:05am

I do agree that doing it manually is not ideal. But I don't think that packages are the problem here. A package is just a grouping of files. Packages can be composed to yield composite packages. Some libraries are offered as simple packages while others are offered as composite packages. In the latter case a common choice for the component packages is to divide them along the lines of source code, tests, and docs, since that's how the dependencies often partition cleanly. This seems like a good if low-level solution. If high-level abstractions could be layered on top of this to provide a more user-friendly experience, I would be in support of that. Maybe such a solution would be analogous to submodules in Racket -- where in an info.rkt file, we could indicate that the package includes subpackages, where these subpackages are of the same status as the composite (just like submodules in Racket). I'm not sure if Racket modules are composable, though, so not sure how far the analogy stretches.
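A minimal sketch of the composite arrangement described above, using today's mechanisms rather than the hypothetical "subpackages in info.rkt" idea. It lays out a made-up library called mylib as three component packages plus one composite package that depends on and implies them. The package names, the directory name mylib-demo, and the dependency lists are all assumptions for illustration; the components would still need their actual source, docs, and tests.

```shell
#!/bin/sh
# Sketch: a hypothetical "mylib" split into -lib, -doc and -test packages,
# plus a composite "mylib" package that pulls in all three.
# Every name and dependency list below is an illustrative assumption.
set -eu

root="${DEMO_ROOT:-./mylib-demo}"
mkdir -p "$root/mylib-lib" "$root/mylib-doc" "$root/mylib-test" "$root/mylib"

# Source code only: the package other libraries would depend on.
cat > "$root/mylib-lib/info.rkt" <<'EOF'
#lang info
(define collection "mylib")
(define deps '("base"))
EOF

# Documentation: Scribble and the library itself are build-time dependencies.
cat > "$root/mylib-doc/info.rkt" <<'EOF'
#lang info
(define collection "mylib")
(define deps '("base"))
(define build-deps '("scribble-lib" "racket-doc" "mylib-lib"))
EOF

# Tests: depend on the library and on rackunit.
cat > "$root/mylib-test/info.rkt" <<'EOF'
#lang info
(define collection "mylib")
(define deps '("base" "rackunit-lib" "mylib-lib"))
EOF

# Composite: contains no code of its own, only the grouping.
cat > "$root/mylib/info.rkt" <<'EOF'
#lang info
(define collection 'multi)
(define deps '("mylib-lib" "mylib-doc" "mylib-test"))
(define implies '("mylib-lib" "mylib-doc" "mylib-test"))
EOF

# Show the manifest files that were written.
find "$root" -name info.rkt | sort
```

With a layout like this, someone who wants everything installs mylib, while a downstream library that cares about its dependency footprint would list only mylib-lib in its own deps.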

simonls · February 17, 2022, 9:21am

I think clearly knowing what goes into the package without having to look up a manifest file can be seen as a good thing. But note that this doesn't require multiple packages: you can have a repository containing a single package as a sub-tree, with examples as sibling directories outside the package.
It is just that when the status quo makes you feel forced to split into multiple packages, you discover this technique automatically, whereas with a single package you often don't get the idea to put the package one level deeper.

I also want to note compile-omit-paths. I am not sure if this excludes files or just treats them as if they weren't there, but it blurs the line of what is clearly included a little bit.

simonls · February 17, 2022, 10:01am

To add to that a little bit.
Racketeers that always use the main distribution probably won't really notice it, because most things are already installed.
But the one time I wanted to use minimal racket and install my package, one of its dependencies had no -lib, and from there it started to pull in all kinds of dependencies. In the end it had pulled in half of the main distribution or more, making minimal racket kind of pointless in that situation.

Also just because a lot of people have the main distribution installed anyway, doesn't mean that being able to only install what is necessary is any less useful and desirable. If this works (ideally without lots of manual splitting) racket can be used faster, easier and in more cases. Maybe a bit of a straw-man, but also maybe not: consider a blind person sitting there with a slow internet connection waiting for all the gui / drracket stuff to be installed, while he just wants to work with vi and run some text/console scripts with some unfortunate dependencies. I can see, but still I don't want to install things I don't use.
(Most of the time I use the main distribution, but I also imagine a future where there might be other distributions)

I guess it could be argued that whatever dependency that pulled in most of the stuff just needs a few more splits, but this doesn't seem "rackety" to me.

countvajhula · February 17, 2022, 5:22pm

compile-omit-paths is a great point. This excludes the paths from being built but it doesn't exclude them from being present, i.e. the filesystem is still the manifest here, verbatim (from what I can tell by looking at the modules list in the package index page). But you know what, I think my preferred solution here would be for compile-omit-paths to be eliminated, and to simply put the package one level below the top level in the repo, as you alluded to. Would be interesting to see if there are any actual cases where this cannot be the solution.
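As a rough illustration of the "package one level below the top level" layout discussed here, the script below creates a made-up repository in which only the mylib subdirectory is the package and the examples sit beside it. The names nested-demo, mylib, and greet are hypothetical, chosen only for this sketch.

```shell
#!/bin/sh
# Sketch: a repository whose package lives in a subdirectory, so that
# examples stay outside the package without needing compile-omit-paths.
# All file and directory names are illustrative assumptions.
set -eu

repo="${DEMO_REPO:-./nested-demo}"
mkdir -p "$repo/mylib" "$repo/examples"

# The package: everything under mylib/ is installed, nothing else is.
cat > "$repo/mylib/info.rkt" <<'EOF'
#lang info
(define collection "mylib")
(define deps '("base"))
EOF

cat > "$repo/mylib/main.rkt" <<'EOF'
#lang racket/base
(provide greet)
(define (greet name) (string-append "hello, " name))
EOF

# A sibling directory: part of the repository, not part of the package.
cat > "$repo/examples/demo.rkt" <<'EOF'
#lang racket/base
(require mylib)
(displayln (greet "world"))
EOF

# Installation would then point at the subdirectory only, for example
#   raco pkg install ./nested-demo/mylib
# for a local checkout, or a Git package source ending in ?path=mylib
# for a hosted repository.
find "$repo" -type f | sort
```

Here the filesystem under mylib/ really is the manifest: there is no info.rkt at the repository root and nothing to omit.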

alexh · February 17, 2022, 11:22pm

I had a look at the times it takes to install Racket and additional packages for building my application and here is what I found:

For the Linux build, it takes 30 seconds to download and install the full Racket distribution.
On a Windows VM, downloading and installing the full Racket distribution takes up to 2 minutes.
Installing the 14 packages my application depends on takes 30 seconds on Linux and 45 seconds on Windows -- this is without building documentation (i.e. --no-docs passed to raco pkg install).
When also building documentation for these 14 packages, it takes 1 minute 4 seconds on Linux and 1 minute 22 seconds on Windows.

Given that it takes about 5 minutes to build my application and about 15 minutes to run the tests, the Racket installation and additional package installation is not that important. Also, while I do use --no-docs in my build, the XKCD below shows me that the savings are not that impressive even if doing daily automated builds (I do perhaps 1 - 2 builds each week).
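To put the --no-docs numbers from this post in perspective, here is a back-of-the-envelope calculation. The timings are the ones reported above; taking 2 builds per week (the upper end of "1 - 2") and 52 weeks per year are assumptions made for the estimate.

```shell
#!/bin/sh
# Package install times reported in the post, in seconds.
linux_no_docs=30
linux_with_docs=64      # 1 minute 4 seconds
windows_no_docs=45
windows_with_docs=82    # 1 minute 22 seconds

# Assumption: 2 builds per week, 52 weeks per year.
builds_per_week=2
builds_per_year=$((builds_per_week * 52))

# Time saved per build by skipping the documentation build.
linux_saving=$((linux_with_docs - linux_no_docs))
windows_saving=$((windows_with_docs - windows_no_docs))

echo "Linux:   ${linux_saving}s per build, $((linux_saving * builds_per_year / 60)) min per year"
echo "Windows: ${windows_saving}s per build, $((windows_saving * builds_per_year / 60)) min per year"
```

That works out to 34 and 37 seconds per build, or roughly an hour per year at this build frequency.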

As for disk space, I recently set up a new Raspberry Pi machine and the smallest SD card I could buy at reputable sellers in Australia was 32 GB. I still have 26 GB of free space on that one.

Perhaps others can measure and reply with their build and installation times here, so we get an understanding of the potential savings we would get by further optimizing package layout and installation options.

Alex.

simonls · February 20, 2022, 12:04pm

I agree that currently under the right circumstances, with the right flags it is fast.
I still think there are cases where it does unnecessary things.
Maybe performance is just good enough for now.

Performance isn't the only question in this topic.

This topic is also about how we interact with packages.
What we expect of the package system. And also what we have to do to use it.
It is about understanding how others use it.

Personally I found the discussion hugely valuable, and it has given me new things to think about and try in practice: from different ways I could organize my own packages, through different views on documentation and tests, to best practices for installing and using packages and an emphasis on distributing applications as binaries instead of installing them.
(That is more doable even for different architectures now that we have raco-cross, but you still have to know that you may want to do it; and in my experience you end up installing packages with raco-cross too in some way, just while building the binary, so you can't completely get rid of the install.)

I started this topic with:

I think the answers have helped create better understanding.
There are still wrinkles with the current system, maybe you care about those, maybe you don't.
Personally I think we should talk about those, not just sweep those under the carpet.

If we don't talk about it, then how can we further our understanding?
I don't think having this discussion is a waste of time.

I think your test is a valid data point for "Performance of status-quo system".
But it doesn't show e.g. whether manual splits are having a positive effect on that performance.
So we don't know whether it is fast, because enough people have made good decisions on manual splits, or whether the manual splits are unneeded.
Why would we make manual splits if they weren't needed?
What is the point of minimal racket, when the answer is "don't use it"?

alexh · February 20, 2022, 10:56pm

This is exactly the point I was trying to make: the effort required to make and maintain manual splits does not justify the benefits. In my opinion, anyway.

I provided some data to justify my opinion and invited others to consider the same effort-benefit tradeoff and provide their own (numeric) data to support their opinions. Of course, others don't have to do this and they have every right to structure their own packages as they think is best.

One use of minimal-racket is to allow a developer to work and make modifications to packages from the main Racket distribution without having to install from source and compile the entire Racket distribution. It is also used in continuous integration builds for racket packages which are part of the main distribution.

For example, to build and test the plot package, which is part of the main racket distribution, the following steps are used:

Install racket-minimal for a released Racket version (e.g. 8.4).
Set up a catalog that lists the local plot package checkout, and configure it with higher priority than the default catalog.
Install the plot package.

The above steps will result in an almost complete Racket installation, but the plot package is installed from source and new modifications can be tested without having to build all of Racket.
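The catalog and install steps above might look roughly like the following. This is a sketch under assumptions, not the plot repository's actual build script: the checkout and catalog paths are hypothetical, and the use of pkg/dirs-catalog to build the local catalog is one possible way to do it. The script only prints the commands unless DRY_RUN=0 is set, and it assumes racket-minimal 8.4 is already installed.

```shell
#!/bin/sh
# Sketch: install "plot" from a local checkout on top of racket-minimal.
# Dry run by default: each command is printed instead of executed.
set -eu

DRY_RUN="${DRY_RUN:-1}"
plot_checkout="${PLOT_CHECKOUT:-$PWD/plot}"           # hypothetical clone location
local_catalog="${LOCAL_CATALOG:-$PWD/plot-catalog}"   # hypothetical catalog directory
release_catalog="https://download.racket-lang.org/releases/8.4/catalog/"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build a directory catalog whose entries link to the packages in the checkout.
run racket -l- pkg/dirs-catalog --link "$local_catalog" "$plot_checkout"

# Catalogs are consulted in order, so listing the local one first gives it priority.
run raco pkg config --set catalogs "file://$local_catalog" "$release_catalog"

# Install plot; its remaining dependencies come from the release catalog.
run raco pkg install --auto plot
```

Because the local catalog is listed first, the plot packages resolve to the checkout while everything else comes from the 8.4 release catalog.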

As for the rest of the discussion, I agree with you that it is interesting to see how others use the package system and learn from it. This discussion could also potentially result in improvements to the package system, which is also good.

Alex.

simonls · February 23, 2022, 12:08pm

Sid Kasivajhula's (@countvajhula) blog post contains additional information about the practice of splitting packages: why you would want to, how to do it, and links to and mentions of related ideas:

winny · February 25, 2022, 6:10pm

The outcome of this discussion might be of benefit for Racket adoption at large. If users have to wait anywhere from minutes to hours (Racket runs on my Pentium M) to (unknowingly) install racket-doc when deploying or developing in certain environments, they may be inclined to find other solutions.

--binary does work in some cases:

docker run -ti --rm racket/racket /bin/bash -c 'time raco pkg install --auto --binary base58' finishes in 16 seconds
docker run -ti --rm racket/racket /bin/bash -c 'time raco pkg install --auto base58' finishes in 228 seconds

Here's an example of why --binary may not work: built packages are only provided for the current release (8.4 here), while this Docker image still ships Racket 8.3:

$ docker pull racket/racket
Using default tag: latest
latest: Pulling from racket/racket
Digest: sha256:82b39728b72da3960213405cd937c92844ba864fadfdc8ab8d5b9eabde7d2d47
Status: Image is up to date for racket/racket:latest
docker.io/racket/racket:latest
$ docker run -ti --rm racket/racket raco pkg install --binary mime-type-lib
Resolving "mime-type-lib" via https://download.racket-lang.org/releases/8.3/catalog/
Resolving "mime-type-lib" via https://pkg-build.racket-lang.org/server/built/catalog/
Downloading https://pkg-build.racket-lang.org/server/built/pkgs/mime-type-lib.zip
raco pkg install: package content is not compatible with the requested conversion
  package: mime-type-lib
  requested conversion: binary
  package content: built
  content for version: 8.4
$ docker run -ti --rm racket/racket racket --version
Welcome to Racket v8.3 [cs].
