Introducing Seq: A generic, isomorphic sequence library

Hi all,
If you use Alexis King's Generic Collections (data/collection) library for generic sequences and collections, you may be interested in this package I put up recently.

Ever since I discovered data/collection, whenever I needed to do anything on a sequence, I made it a habit to write and use a generic API instead of doing it in a one-off way. I finally cleaned these up and put them up here:

https://docs.racket-lang.org/seq/index.html

I seem to recall during Advent of Code last year some folks asked about standard APIs like init (the opposite of rest) that were nowhere to be found. Obviously, implementing some of these isn't hard, but it's still nice to have them readymade, so this library includes it and others that resemble SRFI-1 and other such standard collections of list-like APIs. It also introduces some naming conventions that I hope will make the APIs easier to remember and use.

In addition to extending the data/collection library with more APIs, the seq library in some cases also overrides data/collection APIs to provide additional functionality such as negative indexing (e.g. (set-nth -1 "o" "hellp")). As seq builds on top of data/collection and is intended to be used alongside it, it would probably make sense to merge some of these upstream.

In terms of performance, there's a lot of room for improvement and I'd appreciate pull requests. For instance the init function is, like everything else, implemented in terms of streams. Yet, for vectors (as Alexis showed in her original talk), you could implement this in constant time by just returning a view of the immutable underlying data structure, truncated at index (length-1). There are no such optimizations employed in seq interfaces, and again, pull requests are welcome!

Finally, the library also includes the seq/iso module which leverages Generic Relations to provide optional "isomorphic" APIs. This means that, with (require seq/iso), the same APIs provided by the library have an additional layer that restores the type of the output to the type of the input that you passed in -- even for custom sequence types that you define.

> (require seq/iso)
> (rest "hello")
"ello"
> (by 3 "a12b34c56d")
"abcd"
> (remove "v" "hevvy")
"hey"
> (map + #(1 2 3) #(1 2 3))
'#(2 4 6)
> (trim-if char-whitespace? "   hello there\n\n")
"hello there"
> (trim-if negative? (list -1 2 -1 3 -7))
'(2 -1 3)
> (trim 1 (list 1 1 2 3 5 3 2 1 1))
'(2 3 5 3 2)
> (rotate-right 2 "llohe")
"hello"

This kind of isomorphic behavior is perhaps not the right solution (again, as Alexis hinted at in her talk linked above). If you can return something that functions the same as your expected result (such as a "view" into your original data structure) without actually computing a tangible result, then that is better than producing one in linear time. And if there are different types involved in the input (e.g. (map + '(1 2 3) #(1 2 3))), choosing one arbitrarily for the result once again seems hasty. Nevertheless, this kind of isomorphic behavior is extremely convenient. And in a lot of cases (e.g. tests, documentation, exploratory hacking, ...) convenience is the main criterion. It is also likely that if streams had a better default print representation (as Alexis suggests in the video!) then seq/iso would not be needed. But for now if it is convenience you seek, then just (require seq/iso) for ignorance and bliss, my friend.

Finally, I just want to observe that having general-purpose APIs like this is useful, and having expressive ad-hoc ways to describe sequences such as the for* forms is also great (and they can certainly be used together!). But it feels like there is room for a declarative language here to describe sequence operations, with primitives like by and trim and modifiers like while, at and where, so that these are not distinct APIs but are part of the language. A language more fluent than the for* comprehensions, but less gratuitous than something like Common Lisp's LOOP. Maybe something like Iterate, but generic and functional, and ideally, also isomorphic (or at least with a form like (as vector) to indicate the return type, or employing a return type with a helpful default print representation). That, I feel, would be quite nice.

Maybe something like this could make sense for the new generic collections framework for Rhombus that I believe @notjack is working on. But until such a point, enjoy :slight_smile:

-Sid

P. S. I used the "announcements" category for this post since it is a new library "announcement," but I see that @spdegabrielle had posted about this category a few weeks ago soliciting opinions on what this category should be used for. The other category that seemed relevant is "show and tell" but that also seems quite broad. There was also some discussion some time ago re: a "low volume" announcements category which could have independent visibility rules and customizations. I don't know if this is that category or not. If people have opinions about this, please share on Stephen's thread, and I'll update this post as needed, thanks :pray:

14 Likes

A post was split to a new topic: Discussion of Introducing Seq: A generic, isomorphic sequence library

This looks great :heart:

Could you elaborate a bit more how it's different from Clojure Sequences, and where it's similar, or even equal?

1 Like

I recommend watching that video I linked above which talks about the generic collections library (data/collection), which is the foundation for the present library. data/collection was, I believe, inspired by the Clojure sequence library, and from the link you sent, I can confirm that seq and data/collection are very similar to Clojure -- generic, lazy, and immutable. Beyond that, I'm not sure, but maybe someone else could chime in here since I think data/collection is fairly widely used.

Yay ā€˜moveā€™ worked! (Making up 20 characters)

1 Like

Donā€™t forget to post a link the link to the package announcement on Reddit/r/racket and Twitter

https://racket.discourse.group/t/introducing-seq-a-generic-isomorphic-sequence-library/816

It is also worth posting on the racket discord, and maybe other discords that have racket activity; lisp, scheme, etc.

S.

@countvajhula The contract on nth looks odd: it says the pos should be a natural, but the documentation and examples say it allows negatives. Idem. for set-nth; I noticed a few other things while reading, but I don't remember what they are. I may make a PR if I find time :slight_smile:

Since Discussion of Introducing Seq: A generic, isomorphic sequence library and "Introducing Seq" were closed by @spdegabrielle , I'm opening this up hereā€¦ happy to have it merged in the discussion, but it should be unlockedā€¦

2 Likes

My apologies - it is a comedy of errors today.

2 Likes

No problemā€”thanks for all the help!

3 Likes

Thanks for this. I definitely end up using lists far more often than I should because it's easier than using other data structures. Generic interfaces that work for a large set of collections was also something I remember Dlang doing well when I was playing around with it, so I'm glad to see members of the community also adding this to Racket.

3 Likes

Ah, thanks for catching those. There's probably a lot of stuff in my blind spot since I worked on this package incrementally and sporadically over a long period of time. Would gladly accept a PR :slight_smile:

You're welcome! But just to be clear, the majority of the contribution towards generic collections in Racket comes from the data/collection library which has been around for a number of years. This package is more of an add-on/extension, but with some distinct stylistic choices mentioned in the post, but also this one that I forgot to mention:

  • the use of the generic equality predicate from relation/equivalence to simplify the APIs. This is so that you never need to supply a custom equality predicate, the way you sometimes do with built-in list APIs. Any equality customization can be done just by using a #:key optional argument

Now as it happens, generic collections in Racket have yet to take flight even though we've had data/collection for a while, and I'm not sure what the reasons for this are. Since you have had positive experiences using generic collections in other languages, I think any input you can share here would be valuable -- e.g. if you run into any issues using seq and data/collection, consider creating an issue on the seq repo so that we can try and fix it. Such discussions will probably also help the Rhombus generic collections initiative succeed, since understanding how people actually want to use generic collections can help in designing the best system.

3 Likes