Recompose v1.1: now with missing data handling!

Sunday, 12 May 2024

By Cariad Eccleston • 👋 @cariad@indiepocalypse.social • ✉️ cariad@cariad.earth

I got ninety-nine problems, but a broken path ain’t one.

Recompose v1.0 schema example

Last July, I released Recompose: a Python package for recomposing data by following instructional schemas.

Eh?

Let me explain

Let’s say, for example, you’ve got this data set:

{
  "2023-06-04": {
    "groups": {
      "chefs": [
        {
          "name": "Alice"
        },
        {
          "name": "Bob"
        }
      ],
      "firefighters": [
        {
          "name": "Daniel"
        },
        {
          "name": "Esther"
        }
      ],
      "zookeepers": [
        {
          "name": "Gregory"
        },
        {
          "name": "Harold"
        }
      ]
    }
  },
  "2023-06-05": {
    "groups": {
      "chefs": [
        {
          "name": "Jet"
        },
        {
          "name": "Karen"
        }
      ],
      "firefighters": [
        {
          "name": "Mater"
        },
        {
          "name": "Nigel"
        }
      ],
      "zookeepers": [
        {
          "name": "Peter"
        },
        {
          "name": "Quentin"
        }
      ]
    }
  }
}

…and let’s say you want to reduce those two “firefighters” arrays down to just their first values.

(It’s a niche problem, I grant you, but it’s one that trips me up on the regular.)

With Recompose, you can apply that transformation via this schema:

{
    "version": 1,
    "on": "each-value",
    "perform": {
        "path": "groups",
        "cursor": {
            "perform": {
                "path": "firefighters",
                "cursor": {
                    "perform": "list-to-object"
                }
            }
        }
    }
}

In a nutshell, this directs Recompose to:

  1. Run a cursor over every root value ("on": "each-value"), and within that:
  2. Run a cursor over each child record’s “groups” key ("path": "groups"), and within that:
  3. Perform the “list-to-object” transformation ("perform": "list-to-object") on each child record’s “firefighters” key ("path": "firefighters").
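For comparison, here’s roughly what those three steps look like written out by hand in plain Python. It’s only a sketch of the equivalent logic, not Recompose’s implementation: the first_firefighter_only function is a made-up name for illustration, and it assumes “list-to-object” simply keeps the first item of the list, as described above.

```python
from copy import deepcopy
from typing import Any


def first_firefighter_only(data: dict[str, Any]) -> dict[str, Any]:
    """Hand-rolled equivalent of the schema above (hypothetical helper)."""
    result = deepcopy(data)  # leave the caller's data untouched

    # Step 1: visit every root value ("on": "each-value").
    for record in result.values():
        # Step 2: move into the record's "groups" key ("path": "groups").
        groups = record["groups"]
        # Step 3: swap the "firefighters" list for its first item.
        groups["firefighters"] = groups["firefighters"][0]

    return result


# A cut-down version of the data set above.
sample = {
    "2023-06-04": {
        "groups": {
            "chefs": [{"name": "Alice"}, {"name": "Bob"}],
            "firefighters": [{"name": "Daniel"}, {"name": "Esther"}],
        }
    },
    "2023-06-05": {
        "groups": {
            "chefs": [{"name": "Jet"}, {"name": "Karen"}],
            "firefighters": [{"name": "Mater"}, {"name": "Nigel"}],
        }
    },
}

reduced = first_firefighter_only(sample)
```

That’s fine for one transformation, but every new shape of data means another bespoke loop, which is the chore the schema takes away.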

The schema is applied and data recomposed with this code:

from recompose import transform

transformed = transform(
    schema,
    data,
)

…and the result is this data set:

{
  "2023-06-04": {
    "groups": {
      "chefs": [
        {
          "name": "Alice"
        },
        {
          "name": "Bob"
        }
      ],
      "firefighters": {
        "name": "Daniel"
      },
      "zookeepers": [
        {
          "name": "Gregory"
        },
        {
          "name": "Harold"
        }
      ]
    }
  },
  "2023-06-05": {
    "groups": {
      "chefs": [
        {
          "name": "Jet"
        },
        {
          "name": "Karen"
        }
      ],
      "firefighters": {
        "name": "Mater"
      },
      "zookeepers": [
        {
          "name": "Peter"
        },
        {
          "name": "Quentin"
        }
      ]
    }
  }
}

But…

But what happens if the schema describes a path that doesn’t exist?

Recompose v1.0 requires paths to exist, and bombs out with a KeyError if they don’t.

Recompose v1.1, though, which I just released, has a new Options class so you can decide for yourself whether Recompose skips that transformation or fails:

from recompose import Allow, Options, transform

options = Options(
    missing_data=Allow.ALLOW,
)

transformed = transform(
    schema,
    data,
    options=options,
)
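To picture what that choice means, here’s a plain-Python sketch of the skip-or-fail decision. It isn’t Recompose’s code: first_of and its allow_missing flag are hypothetical stand-ins for the behaviour the option controls.

```python
from typing import Any


def first_of(record: dict[str, Any], key: str, allow_missing: bool = False) -> None:
    """Replace the list at record[key] with its first item, in place.

    Hypothetical helper: allow_missing=True skips a missing key,
    while allow_missing=False raises KeyError.
    """
    if key not in record:
        if allow_missing:
            return  # nothing to transform, so leave the record alone
        raise KeyError(key)

    record[key] = record[key][0]


groups = {"chefs": [{"name": "Alice"}, {"name": "Bob"}]}

# Skipping: there's no "firefighters" key, so the record is left unchanged.
first_of(groups, "firefighters", allow_missing=True)

# Failing: the same call without the flag raises KeyError.
try:
    first_of(groups, "firefighters")
except KeyError as error:
    print(f"Missing path: {error}")
```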

And that’s the lot!

Recompose v1.1

Recompose v1.1 requires Python 3.9 or later and can be installed from PyPI at your leisure:

pip install recompose

The code is released under the MIT licence at github.com/cariad/recompose, too, and documented at cariad.github.io/recompose.

Have fun!