Hi all,
If you use Alexis King's Generic Collections (data/collection
) library for generic sequences and collections, you may be interested in this package I put up recently.
Ever since I discovered data/collection, whenever I needed to do anything on a sequence, I made it a habit to write and use a generic API instead of doing it in a one-off way. I finally cleaned these up and put them up here:
https://docs.racket-lang.org/seq/index.html
I seem to recall during Advent of Code last year some folks asked about standard APIs like init
(the opposite of rest
) that were nowhere to be found. Obviously, implementing some of these isn't hard, but it's still nice to have them readymade, so this library includes it and others that resemble SRFI-1 and other such standard collections of list-like APIs. It also introduces some naming conventions that I hope will make the APIs easier to remember and use.
In addition to extending the data/collection
library with more APIs, the seq
library in some cases also overrides data/collection APIs to provide additional functionality such as negative indexing (e.g. (set-nth -1 "o" "hellp")
). As seq
builds on top of data/collection and is intended to be used alongside it, it would probably make sense to merge some of these upstream.
In terms of performance, there's a lot of room for improvement and I'd appreciate pull requests. For instance the init
function is, like everything else, implemented in terms of streams. Yet, for vectors (as Alexis showed in her original talk), you could implement this in constant time by just returning a view of the immutable underlying data structure, truncated at index (length-1). There are no such optimizations employed in seq interfaces, and again, pull requests are welcome!
Finally, the library also includes the seq/iso
module which leverages Generic Relations to provide optional "isomorphic" APIs. This means that, with (require seq/iso)
, the same APIs provided by the library have an additional layer that restores the type of the output to the type of the input that you passed in -- even for custom sequence types that you define.
> (require seq/iso)
> (rest "hello")
"ello"
> (by 3 "a12b34c56d")
"abcd"
> (remove "v" "hevvy")
"hey"
> (map + #(1 2 3) #(1 2 3))
'#(2 4 6)
> (trim-if char-whitespace? " hello there\n\n")
"hello there"
> (trim-if negative? (list -1 2 -1 3 -7))
'(2 -1 3)
> (trim 1 (list 1 1 2 3 5 3 2 1 1))
'(2 3 5 3 2)
> (rotate-right 2 "llohe")
"hello"
This kind of isomorphic behavior is perhaps not the right solution (again, as Alexis hinted at in her talk linked above). If you can return something that functions the same as your expected result (such as a "view" into your original data structure) without actually computing a tangible result, then that is better than producing one in linear time. And if there are different types involved in the input (e.g. (map + '(1 2 3) #(1 2 3))
), choosing one arbitrarily for the result once again seems hasty. Nevertheless, this kind of isomorphic behavior is extremely convenient. And in a lot of cases (e.g. tests, documentation, exploratory hacking, ...) convenience is the main criterion. It is also likely that if streams had a better default print representation (as Alexis suggests in the video!) then seq/iso would not be needed. But for now if it is convenience you seek, then just (require seq/iso)
for ignorance and bliss, my friend.
Finally, I just want to observe that having general-purpose APIs like this is useful, and having expressive ad-hoc ways to describe sequences such as the for*
forms is also great (and they can certainly be used together!). But it feels like there is room for a declarative language here to describe sequence operations, with primitives like by
and trim
and modifiers like while
, at
and where
, so that these are not distinct APIs but are part of the language. A language more fluent than the for*
comprehensions, but less gratuitous than something like Common Lisp's LOOP. Maybe something like Iterate, but generic and functional, and ideally, also isomorphic (or at least with a form like (as vector)
to indicate the return type, or employing a return type with a helpful default print representation). That, I feel, would be quite nice.
Maybe something like this could make sense for the new generic collections framework for Rhombus that I believe @notjack is working on. But until such a point, enjoy
-Sid
P. S. I used the "announcements" category for this post since it is a new library "announcement," but I see that @spdegabrielle had posted about this category a few weeks ago soliciting opinions on what this category should be used for. The other category that seemed relevant is "show and tell" but that also seems quite broad. There was also some discussion some time ago re: a "low volume" announcements category which could have independent visibility rules and customizations. I don't know if this is that category or not. If people have opinions about this, please share on Stephen's thread, and I'll update this post as needed, thanks