pdfcat as a Scala script

I am really enjoying scripting in Scala with scala-cli.

Scala has long supported scripting in theory, but for a script to run, all of its dependencies had to be installed already and present on your CLASSPATH. That made the feature not so useful. Under scala-cli, however, a script declares its own dependencies, and they are delightfully fetched and managed for you. This has made JVM scripting practical and convenient for me.
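As a minimal sketch of what that looks like, here is a tiny script that pulls in PDFBox through a scala-cli `//> using dep` directive and prints page counts. The file name `pagecount.sc`, the helper `pageCount`, and the pinned 2.0.x version are assumptions chosen for illustration (`PDDocument.load` is the 2.x API), not something taken from my actual scripts.

```scala
//> using dep org.apache.pdfbox:pdfbox:2.0.30

import java.io.File
import org.apache.pdfbox.pdmodel.PDDocument

// Count the pages in one PDF, closing the document afterwards.
// (pageCount is a hypothetical helper, named here for illustration.)
def pageCount(file: File): Int =
  val doc = PDDocument.load(file) // PDFBox 2.x loading API
  try doc.getNumberOfPages
  finally doc.close()

// Script body: report the page count of every PDF named on the command line.
args.foreach(fn => println(s"$fn: ${pageCount(new File(fn))} pages"))
```

Something like `scala-cli run pagecount.sc -- a.pdf b.pdf` should then resolve the dependency on first run and execute the script, with nothing installed by hand.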

I find that psychologically this is a BFD. It dissolves the boundary between app and library. Usually, when I want to get something done, I look for an app. If I want to merge a bunch of PDFs together, I might fire up Adobe Acrobat, curse myself, and mess around until I figure out how you do that.

However, the Java PDFBox library exists. (itextpdf too!) With easy scripting, the library can substitute for the app. One glance at a tutorial on how to merge PDFs with PDFBox and I was off to the races.

The meaningful code is trivial:

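import java.io.File
import org.apache.pdfbox.multipdf.PDFMergerUtility

// The last argument is the output path; every argument before it is an input PDF.
val files = args.map(fn => new File(fn))
val merger = new PDFMergerUtility()
merger.setDestinationFileName(args.last)
files.init.foreach(merger.addSource)
merger.mergeDocuments()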

The full script is longer than this, of course, but the rest is sanity-checking the command line and aborting if it isn't right.
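For a flavor of that sanity-checking, here is a rough sketch using only the standard library. The name `checkArgs`, the usage text, and the specific rules (at least two inputs, every input must exist) are assumptions for illustration; the real script's checks may differ.

```scala
import java.io.File

// Split the arguments into (inputs, output), or explain what is wrong.
// `exists` is injectable so the check can be exercised without touching disk.
def checkArgs(
    args: Seq[String],
    exists: String => Boolean = fn => new File(fn).isFile
): Either[String, (Seq[String], String)] =
  if args.length < 3 then
    Left("usage: pdfcat <in1.pdf> <in2.pdf> [...] <out.pdf>")
  else
    val inputs = args.init // everything but the last argument
    val output = args.last // the destination file
    inputs.find(fn => !exists(fn)) match
      case Some(missing) => Left(s"no such input file: $missing")
      case None          => Right((inputs, output))
```

In the script body, a `Left` would be printed to stderr followed by `sys.exit(1)`, and a `Right` would be handed on to the merge code.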

In fact, I often find command-line parsing outweighs functional code when I write scripts. Of course that's all optional. You can skip a nice command line if you really mean to script a one-off. But it is great to be able to re-solve, instantly and from a nice command line, problems you've already solved once. And it's great practice with the Scala ecosystem's rich set of command-line parsing libraries.

A few days ago I needed to quickly serve a directory by HTTP from my laptop. There's some Python command I've used for that in the past, but I'd have to look it up and figure out how to get it to bind to the laptop's public interface rather than localhost. It was quicker to script something up with Li Haoyi's cask library and build a nice command line with decline. Check out http-serve.
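To give a sense of the decline half of that, here is a small sketch of a command-line definition. The option names (`--host`, `--port`), their defaults, and the positional `dir` argument are hypothetical and may not match what http-serve actually accepts; the cask server wiring is left out entirely.

```scala
//> using dep com.monovore::decline:2.4.1

import cats.syntax.all.*
import com.monovore.decline.*

// Hypothetical options; the real http-serve may accept different ones.
val host = Opts
  .option[String]("host", help = "Interface to bind.")
  .withDefault("0.0.0.0") // all interfaces rather than just localhost
val port = Opts
  .option[Int]("port", short = "p", help = "Port to listen on.")
  .withDefault(8080)
val dir = Opts
  .argument[String](metavar = "dir")
  .withDefault(".") // serve the current directory unless told otherwise

// Combine the three into one command that yields a (host, port, dir) tuple.
val command =
  Command(name = "http-serve", header = "Serve a directory over HTTP.")(
    (host, port, dir).tupled
  )
```

Calling `command.parse(args.toSeq)` yields either a `Help` value to print or the parsed tuple to hand to the server, and `--help` output comes for free.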

05:00 PM EDT