YAMLScript Spring Update

by Ingy döt Net | 6 min read

It's been a while since I let you know what's been happening with YAMLScript. I've been busy working on it every day this year and I have a lot to tell you about!

YAMLScript Activity in 2024

Let me start by telling you about some of the events that have happened in the YAMLScript world recently.

  • Seajure Talk - I gave a talk at the Seajure (Seattle Clojure) Meetup in March.
  • YAMLScript Article - The New Stack published an article about YAMLScript in March.
  • YAMLScript Podcast - I was interviewed on "The REPL" podcast by Daniel Compton in April.

Finally, I'm presenting a talk about YAMLScript at the Open Source Summit North America this Thursday, April 18th. Super excited about that!

YAMLScript Progress Since the Advent Series

I blogged about YAMLScript every day in December 2023. That was something. We were madly trying to implement all the stuff I was talking about. I may have told a few small lies along the way, but I'm happy to say that everything I talked about is now implemented and working.

And of course we've added a lot more since then.

Let's talk about some of the highlights.

More YAMLScript Language Binding Libraries

We've added 4 new binding libraries for YAMLScript in 2024, bringing the total to 8: Clojure (new), Java (new), NodeJS (new), Perl, Python, Raku, Ruby (new) and Rust.

The idea is to eventually have binding libraries for every language where YAML is currently used. I expect more to come soon. If you see a missing language that you want (and you are handy with FFI), please consider writing a binding library for it and submitting a PR to the YAMLScript Mono Repository.

Each of these libraries is a small amount of code and is easy to write.

If you need help, please stop by the YAMLScript Matrix Chat Room and we'll get you what you need.

New Dot Chaining Operator

In Clojure you can write:

(take 5
  (shuffle
    (filter even?
      (range 100))))

Evaluates to something like: [94 36 4 70 74]

Often people like to use Clojure's thread-last macro (->>) to make this more readable:

(->> (range 100)
     (filter even?)
     (shuffle)
     (take 5))

Let's look at the same thing in YAMLScript in various styles.

Basic block style:

take 5:
  shuffle:
    filter even?:
      range: 100

Yes Expression style:

take(5
  shuffle(
    filter(even?
      range(100))))

Clojure threading style:

->> range(100):
  filter(even?)
  shuffle
  take(5)

And now with the new dot chaining operator:

range(100)
  .filter(even? _)
  .shuffle()
  .take(5 _)

Or all on one line:

$ ys -e 'say: range(100).filter(even? _).shuffle().take(5 _)'

More about the Dot Chaining Operator

Above we used the dot operator to chain function calls together. But it's even more useful than that.

Consider:

a.b.3.c().d(e).f(g _).$h

  • If a is a mapping, the "b" (or :b or 'b) key is looked up.
  • 3 selects the 4th element (indexes start at 0) of the list that b resolves to.
  • c is called as a function with that element as its argument.
  • d is called as a function with the result of c and e as arguments.
  • f is called as a function with g and the result of d as arguments.
  • Finally the key in variable h is looked up in the result of f.

One common idiom is looking up environment variables. For example: ENV.HOME or ENV.USER.str/upper-case().
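To make the chaining rules concrete, here's a rough Python model of how a chain like a.b.3.c().d(e).f(g _).$h could be evaluated one step at a time. It's only a sketch of the semantics described above, not how the YAMLScript compiler does it; chain, PLACEHOLDER and all of the sample data and functions are made up for illustration.

```python
# A toy model of dot chaining. Each step is one of:
#   ("key", k)          look up key k in a mapping
#   ("index", n)        take element n (0-based) of a list
#   ("call", f, args)   call f; the running value replaces PLACEHOLDER,
#                       or becomes the first argument if there is none
PLACEHOLDER = object()  # stands in for YAMLScript's `_`


def chain(value, *steps):
    for step in steps:
        kind = step[0]
        if kind in ("key", "index"):
            value = value[step[1]]
        elif kind == "call":
            _, func, args = step
            if any(arg is PLACEHOLDER for arg in args):
                args = [value if arg is PLACEHOLDER else arg for arg in args]
            else:
                args = [value, *args]
            value = func(*args)
        else:
            raise ValueError(f"unknown step kind: {kind}")
    return value


# Hypothetical stand-ins for a, c, d, e, f, g and h.
a = {"b": [10, 20, 30, {"x": 1, "y": 2}]}
e, g, h = 10, "items", "total"


def c(mapping):
    return sorted(mapping.values())


def d(numbers, factor):
    return [n * factor for n in numbers]


def f(label, numbers):
    return {label: numbers, "total": sum(numbers)}


# a.b.3.c().d(e).f(g _).$h
result = chain(
    a,
    ("key", "b"),
    ("index", 3),
    ("call", c, []),
    ("call", d, [e]),
    ("call", f, [g, PLACEHOLDER]),
    ("key", h),
)
print(result)
```

The only subtle part is the placeholder: without a _ the running value goes in as the first argument, and with a _ it goes wherever the _ is.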

New Operators and Syntax

In the last section we saw the new chaining operator: the dot (.).

The interesting thing to note here is that . already had a meaning in Clojure. It's the namespace separator, as in clojure.core/println.

YAMLScript uses :: for that separator instead, so we'd say clojure::core/println.

In a similar switch-up, we added the % and %% operators. (a % b) compiles to the Clojure code (rem a b), and (a %% b) compiles to (mod a b). These are two slightly different math functions that only disagree when the operands have different signs.

In Clojure, % was already a shorthand for the %1 argument in anonymous functions.

In YAMLScript, you'll need to use %1 for that:

square =: \(%1 * %1)
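Back on % versus %% for a moment: the difference between rem and mod only shows up with negative numbers. Here's a small Python sketch of the two behaviors. The rem and mod functions are Python stand-ins written for this example (they follow the truncating and flooring definitions that Clojure's rem and mod use), not YAMLScript code.

```python
def rem(a: int, b: int) -> int:
    """Truncating remainder: the result takes the sign of a."""
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient  # round the quotient toward zero
    return a - b * quotient


def mod(a: int, b: int) -> int:
    """Flooring modulus: the result takes the sign of b."""
    return a % b  # Python's own % already floors


# Left column models YAMLScript's %, right column models %%.
for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
    print(f"rem({a}, {b}) = {rem(a, b)}    mod({a}, {b}) = {mod(a, b)}")
```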

Next we added ** for exponentiation. So 2 ** 3 compiles to (pow 2 3). We also added pow to the ys::std YAMLScript standard library, so you can say pow(2 3) instead of Math/pow(2 3). More about the standard library in a bit.

We added the Perl style regex operator =~ and regex literal /.../.

if (peanut-butter =~ /chocolate/):
  say: "You've got chocolate in my peanut butter!"

You might have noticed that YAMLScript uses \( ... ) for anonymous functions, where Clojure uses #( ... ). Things starting with # are comments in YAML, so that's problematic.

We decided the \ would be a general purpose escape character, but Clojure already uses \ for escaping character literals.

YAMLScript now uses \\ for that purpose: str(\\a \\b \\c) would evaluate to "abc".

In Clojure you see 'foo used extensively for quoting symbols. In YAML, single quotes are used for string literals, and I felt it was important to keep that distinction.

Luckily in Clojure you can use (quote foo) for the same thing. In YAMLScript you can use quote(foo). We added the shorthand q(foo) as well as \'foo for quoting.

Finally, we added foo* splatting.

It is common in Clojure to use apply to call a function with a list of arguments: (apply f [1 2 3]) is the same as (f 1 2 3). So if xs is a sequence of numbers, you'd say (apply f xs).

In YAMLScript you can say f(xs*) instead.

But it gets better. You can use xs* anywhere in a list of arguments: f(1 xs* 3 ys* 5).
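If you know Python, splatting will feel familiar, because Python's * unpacking does the same job. Here's a small sketch of the equivalent calls. The function f and the lists are made up for the example.

```python
def f(*args):
    """Hypothetical function that just reports the arguments it received."""
    return list(args)


xs = [10, 20]
ys = [30, 40]

# YAMLScript: f(xs*)            Clojure: (apply f xs)
only_xs = f(*xs)

# YAMLScript: f(1 xs* 3 ys* 5)
mixed = f(1, *xs, 3, *ys, 5)

print(only_xs)
print(mixed)
```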

Standard Global Variables

Clojure has dynamic variables like *out* and *command-line-args*.

YAMLScript now has a few of these, but we decided to use symbols with ALL-CAPS instead of *earmuffs* for these. We also made them shorter in some cases.

  • ARGV - Command line arguments
  • ARGS - Like ARGV but numbers are converted to numeric values
  • ENV - Environment variable mapping
  • CWD - Current working directory
  • FILE - Path to the current file being processed
  • INC - The YAMLScript module include path
  • VERSIONS - Mapping of versions of key components in YAMLScript

We'll be adding more of these as needed.
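To illustrate the difference between ARGV and ARGS, here's a rough Python sketch of the kind of conversion involved. Which strings count as numeric is an assumption made for this example (plain integers and simple decimals only), and to_args is a hypothetical name, so treat it as a model rather than the actual rule YAMLScript applies.

```python
import re

# Assumption: only plain integers and simple decimals are treated as numbers.
INT_RE = re.compile(r"[-+]?\d+")
FLOAT_RE = re.compile(r"[-+]?\d+\.\d+")


def to_args(argv):
    """Return a copy of argv with numeric-looking strings converted to numbers."""
    args = []
    for arg in argv:
        if INT_RE.fullmatch(arg):
            args.append(int(arg))
        elif FLOAT_RE.fullmatch(arg):
            args.append(float(arg))
        else:
            args.append(arg)  # anything else stays a string
    return args


argv = ["3", "4.5", "foo", "-2"]
print(to_args(argv))
```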

The ys::std YAMLScript Standard Library

We added many libraries that are automatically available in every YAMLScript program:

  • ys::std - The standard library (also available as std)
  • clojure::str - available as str
  • clojure::math - available as math
  • clojure::tools::cli - available as cli
  • babashka::fs - available as fs
  • babashka::http - available as http

and more.

The ys::std library is the most important one. We can see all the functions available in it by running:

$ ys -e 'say: q(std).ns-publics().keys().sort().vec()' | zprint '{:width 40}'
[$$ *_ +++ +++* +_ =-- _& _* _** _T __
_dot abspath curl cwd die dirname each
err exec join new num omap out pow pp
pretty print process q rng say sh shell
sleep throw toBool toFloat toInt toMap
toStr use-pod warn www xxx yyy zzz]

You might note that the print function is part of clojure::core and YAMLScript offers all of that by default. In a few places we decided to replace Clojure functions with our own versions. But we also added a ys::clj library that has all the Clojure functions in it.

So if you really need clojure::core/print, you can say clj/print.

Here are the functions in ys::clj:

$ ys -e 'say: q(clj).ns-publics().keys().sort().vec()' | zprint '{:width 40}'
[compile load load-file num print use]

Not too many.

We'll look at use in the next section.

Including YAMLScript Libraries and Modules

I'll start by saying we made the require function nice to use in YAMLScript.

The best way to describe it is to show you the actual test case for it:

- name: Various require transformations in one pair
  yamlscript: |
    !yamlscript/v0
    require:
      foo::aaa: => fa
      foo::bbb: one
      foo::ccc: one two
      foo::ddd:
        => fd
        one two
      foo::eee:
      foo::fff: :all

  clojure: |
    (require
      '[foo.aaa :as fa]
      '[foo.bbb :refer [one]]
      '[foo.ccc :refer [one two]]
      '[foo.ddd :as fd :refer [one two]]
      'foo.eee
      '[foo.fff :refer :all])

That's how we write tests for the YAMLScript compiler.

I think it explains how require works in YAMLScript pretty well.
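If it helps to see the mapping rules spelled out, here's a Python sketch that reproduces the Clojure side of that test case from the YAMLScript side. The rules are inferred only from the test above, and require_form is a hypothetical helper, not part of the real compiler.

```python
def require_form(spec):
    """Render YAMLScript-style require entries as a Clojure (require ...) form.

    spec maps a 'foo::bar' name to None or to a string of words, where
    '=> name' means ':as name', ':all' means ':refer :all', and any other
    words become a ':refer [...]' list.
    """
    clauses = []
    for name, value in spec.items():
        lib = name.replace("::", ".")
        words = (value or "").split()
        parts = [lib]
        if words[:1] == ["=>"]:
            parts += [":as", words[1]]
            words = words[2:]
        if words == [":all"]:
            parts += [":refer", ":all"]
        elif words:
            parts += [":refer", "[" + " ".join(words) + "]"]
        if len(parts) == 1:
            clauses.append("'" + lib)  # bare library name
        else:
            clauses.append("'[" + " ".join(parts) + "]")
    return "(require\n  " + "\n  ".join(clauses) + ")"


spec = {
    "foo::aaa": "=> fa",
    "foo::bbb": "one",
    "foo::ccc": "one two",
    "foo::ddd": "=> fd one two",  # the two-line value folds into one string
    "foo::eee": None,
    "foo::fff": ":all",
}
print(require_form(spec))
```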

The require function is used for including Clojure libraries. We've also added support for writing libraries in YAMLScript itself. We'll call them "modules" to distinguish them from Clojure libraries.

To use a module in YAMLScript, you use the use function.

use: 'my-module'

This will look for a file called my-module.ys in the INC path.

The INC path is a list of directories that YAMLScript searches for modules, and it defaults to the current directory. You can override the INC path by setting the YSPATH environment variable.
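The lookup can be pictured with a short Python sketch. It assumes YSPATH holds directories separated by the platform's usual path separator (a colon on Unix), which this post doesn't spell out, and inc_path and find_module are hypothetical names used only here.

```python
import os
import tempfile
from pathlib import Path


def inc_path(environ=os.environ):
    """Directories to search: YSPATH if it is set, else the current directory."""
    raw = environ.get("YSPATH")
    if not raw:
        return [Path(".")]
    # Assumption: entries are separated by os.pathsep (':' on Unix).
    return [Path(entry) for entry in raw.split(os.pathsep) if entry]


def find_module(name, inc):
    """Return the first '<name>.ys' file found on the include path, or None."""
    for directory in inc:
        candidate = directory / f"{name}.ys"
        if candidate.is_file():
            return candidate
    return None


# Try it against a throwaway directory containing my-module.ys.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "my-module.ys").write_text("!yamlscript/v0\n")
    found = find_module("my-module", inc_path({"YSPATH": tmp}))
    missing = find_module("no-such-module", inc_path({"YSPATH": tmp}))

print(found.name if found else None, missing)
```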

YAMLScript also added support for Babashka Pods. You can use a pod in YAMLScript like this:

use-pod: "org.babashka/go-sqlite3" "0.1.0"

Multi-doc and Anchor / Alias Support

This is the last topic for today, but it's a big one.

Anchors and aliases are an important feature of YAML. They let you mark a node with a name and then refer to that node by name later.

Until now, YAMLScript has not supported anchors and aliases. Supporting them is critical because YAMLScript should be able to load all existing YAML config files, and config files often use anchors and aliases.

Well not only did we add support, we took them to the next level!

YAML has the concept of multiple documents in a single file. Unfortunately, this is not very useful in the real world; at least not for config files. One problem is that YAML doesn't allow you to make an anchor in one document and use an alias to it in another document.

YAMLScript makes great use of multi-doc combined with anchors and aliases.

Here's an example YAMLScript file file.ys that is part of my upcoming talk at Open Source Summit North America on Thursday:

--- !yamlscript/v0/

- &data !
  yaml/load: slurp('data.yaml')

- &map1
  key: value

- &seq1
  - one
  - two

- &dogs !
  yaml/load:
    curl:: https://yamlscript.org/dogs.yaml

--- !yamlscript/v0/

some: plain data string
number: 42

sentence::
  "$(*data.name) likes
  $(*data.langs.rand-nth())!!!"
2 dogs:: .*dogs.big.shuffle().take(2 _)

sequence: !concat*:
- *seq1
- [three, four]
mapping: !merge*:
- this: that
- *map1

The data.yaml file that gets loaded looks like this:

name: Ingy
langs:
- Bash
- Clojure
- CoffeeScript
- Perl
- YAMLScript

When you run ys -Y file.ys you get:

some: plain data string
number: 42
sentence: Ingy likes CoffeeScript!!!
2 dogs:
- Saint Bernard
- Great Dane
sequence:
- one
- two
- three
- four
mapping:
  this: that
  key: value

I'll let you figure out what's happening here. It should be fairly obvious and I think it's all pretty cool.
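As a hint for the last two keys, here's a rough Python model of the sequence and mapping results. It assumes the two tags behave like plain list concatenation and a left-to-right map merge, which is what the output above suggests; concat_all and merge_all are names invented for this sketch.

```python
def concat_all(sequences):
    """Join several lists into one, in order."""
    return [item for sequence in sequences for item in sequence]


def merge_all(mappings):
    """Merge several dicts left to right; later keys win on conflict."""
    merged = {}
    for mapping in mappings:
        merged.update(mapping)
    return merged


# Stand-ins for the anchored nodes from the first document.
seq1 = ["one", "two"]     # &seq1
map1 = {"key": "value"}   # &map1

document = {
    "sequence": concat_all([seq1, ["three", "four"]]),
    "mapping": merge_all([{"this": "that"}, map1]),
}
print(document)
```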

Moving Forward

That was a lot of information, I know. I haven't really found the time to blog about these changes on a regular basis.

Hopefully I'll be able to do that more in the future.

Cheers!