BibTeX language

Hello fellow Racketeers,

As part of my research, my BibTeX bibliography file quickly grew, both in the number of entries and in the amount of content in each entry (for example, all the abstracts). I wondered whether there was a tool for working with the BibTeX "database" programmatically, ideally one that exposes an SQL-like query interface. I found none.

Therefore, I quickly hacked one together. And because the best way to solve any problem is to create a language for solving the problem in question, I created such a language.

Meet #lang bibtex (Dominik Joe Pantůček / bibtex-lang on GitLab), and also consider looking at the preliminary documentation at BibTeX Language.

It works similarly to how Scribble works. The #lang bibtex line needs to be added at the beginning of your bibliography.bib file. The good thing is that both bibtex and biber (for biblatex) ignore this line. That is no coincidence: the specification clearly states that any processing software must ignore any text on lines not contained in the bibliography entries. With this one-line change we can now (require "bibliography.bib"); for now, the module provides a single binding called bib.
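As a minimal sketch of the require step (not taken from the package documentation): assume bibliography.bib sits in the same directory as the file below and begins with the #lang bibtex line. The file name use-bib.rkt is hypothetical, and since the shape of the value bound to bib is not described here, the sketch only prints it.

```racket
#lang racket/base
;; use-bib.rkt (hypothetical file name), placed next to bibliography.bib.
;; Assumes the bibtex-lang package is installed and that bibliography.bib
;; starts with the line `#lang bibtex`.
(require "bibliography.bib")

;; `bib` is the single binding the module is said to provide; its exact
;; representation is not specified here, so we just print whatever it is.
(displayln bib)
```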

The bibtex/query module provides a nice interface for querying the data.

But the best feature, and the main reason I put this all together, is what you get when you run the bibliography as a standalone program, as in racket bibliography.bib. Let's see what happens:

> (select id author title)
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃Id       ┃Author                  ┃Title                                                  ┃
┣━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃dh76     │Diffie, W and Hellman,  │“New directions in cryptography,” IEEE Transactions on ┃
┃         │ME                      │Information Theory, vol. IT-22, pp. 644-654            ┃
┠─────────┼────────────────────────┼───────────────────────────────────────────────────────┨
┃elgamal85│Elgamal, T.             │A public key cryptosystem and a signature scheme based ┃
┃         │                        │on discrete logarithms                                 ┃
┠─────────┼────────────────────────┼───────────────────────────────────────────────────────┨
┃rsa78    │Rivest, R. L. and       │A method for obtaining digital signatures and          ┃
┃         │Shamir, A. and Adleman, │public-key cryptosystems                               ┃
┃         │L.                      │                                                       ┃
┗━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
> (select id author title #:where (> year 1980))
┏━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃Id       ┃Author   ┃Title                                                                 ┃
┣━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃elgamal85│Elgamal, │A public key cryptosystem and a signature scheme based on discrete    ┃
┃         │T.       │logarithms                                                            ┃
┗━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
> (select id author title #:where (and (like author "diffie") (< year 1980)))
┏━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃Id  ┃Author           ┃Title                                                              ┃
┣━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃dh76│Diffie, W and    │“New directions in cryptography,” IEEE Transactions on Information ┃
┃    │Hellman, ME      │Theory, vol. IT-22, pp. 644-654                                    ┃
┗━━━━┷━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

The select syntax provides a small DSL just for querying the currently loaded bibliography. All of racket/base is available as well, which means you can interactively search your bibliography, store the results as bindings (using define), and work with them later. The REPL uses Expeditor (the Terminal Expression Editor), so you get almost all the bells and whistles of the normal Racket REPL.
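A short sketch of the define workflow, typed at the REPL started by racket bibliography.bib. It reuses only the select form and the #:where clause from the transcript; the name after-1980 is made up for illustration, and how the stored value prints depends on the package.

```racket
;; At the REPL started by `racket bibliography.bib`:
;; keep a query result around under a name (hypothetical name `after-1980`).
(define after-1980
  (select id author title #:where (> year 1980)))
```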

There are some open design questions, and the package needs some polishing before I publish it on the package server. I would like to know what other Racketeers think and which design decisions they would prefer.

  1. What should the name of the package be? It started as bib-lang; now I use bibtex-lang as the working name.
  2. How should the collection be named? Again, it started as bib-lang, but it has now turned into bibtex, which allows #lang bibtex to be used. That sounds "natural"; however, I am concerned that something more relevant or powerful for BibTeX might show up in the future and find the name already taken.
  3. As @spdegabrielle pointed out on Discord, the #lang interface might be a barrier for other BibTeX users (mostly from academia). What do you think about a raco bibtex bibliography.bib interface?

And of course, any other thoughts or feature requests are more than welcome :wink:

  1. 40 years too late. (Back then I wrote an awk script to manage citations; there was no BibTeX around.)

  2. Some 20 years ago I wrote a Racket script to manage citations for NSF and DARPA proposals.

  3. Then my grad students managed bibs, as if it were 1971 :slight_smile:

  4. By now, I write books with Scribble and rely on Matthew’s bib system.

  5. Adoption
    - If this #lang can manage existing bib files from authors, and especially recognize and merge such files, you’ve got a true winner.
    - If this #lang can abstract over the source of bib and clients, you’ve won the bib war.
    - Is a #lang an obstacle to adoption in academia? Most PL researchers program only in LaTeX and Lean (at most), so probably yes :slight_smile:
    - Is it possible to hide the #lang behind an independent, second interaction language for such people?

A thank-you despite 1 and 2 and 3 :slight_smile: - Matthias

  1. How should the collection be named? Again, it started as bib-lang, but now it turned into bibtex - which allows #lang bibtex to be used. That sounds "natural" however I am concerned whether in the future something more relevant/powerful for BibTeX might show up and the name would already be taken.

Despite only rarely doing anything related to bibliographies recently, I'm quite excited about this :slight_smile:

I would not hesitate to use bibtex; there can always later be a bibtex-improved (like vi improved). Or your package/collection can grow more features (esp. via other packages that also drop libraries into the bibtex collection).