Matter and Void

Miscellanea of learning datomic ions

January 20, 2020 · tech_writing

I started a new contracting job a few months ago writing Clojure, with Datomic as the system of record. We deploy code using datomic ions.

This post is a collection of things I’ve learned about using this system: surprises, useful but non-obvious findings, and stumbling blocks I ran into.

Learning datalog

I found the site below by chance in one of the datomic testing videos (linked in the testing section).

I was very surprised that it isn’t listed on the datomic site.

http://www.learndatalogtoday.org/

Compile ions locally

When you run your ions code in a REPL, some namespaces are already loaded (both by default and by whatever you have required during development), which can mask compilation errors that will occur once the code is deployed. Instead of waiting for a deploy and tracking down bugs via CloudWatch logs, you can compile your code locally to confirm that compilation works:

  1. mkdir classes
  2. add "classes" to :paths in deps.edn, e.g. {:paths ["src" "classes"]}
  3. clj -Adev -e "(compile 'my-ns)"

If you have compile-time errors, or use namespaces that are not required explicitly in an ns form, step 3 will report them.

Make sure you remove classes before you deploy; see the next note.

https://clojure.org/guides/deps_and_cli#aot_compilation
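Steps 1 and 3 can be wrapped so that the classes directory never outlives the check. This is only a sketch: the function name compile_check is hypothetical, and it assumes that "classes" is already on :paths (step 2), that a dev alias exists, and that clj exits non-zero when compilation throws.

```shell
# Hypothetical wrapper around steps 1 and 3 that always cleans up afterwards.
# Assumes "classes" is on :paths in deps.edn and that a dev alias exists.
# Note: ns and rc are plain (global) shell variables.
compile_check() {
  ns="$1"
  mkdir -p classes                 # step 1: the compile output directory
  clj -Adev -e "(compile '$ns)"    # step 3: AOT-compile the namespace
  rc=$?
  rm -rf classes                   # never leave compiled code behind
  return "$rc"
}
```

Usage: compile_check my-ns, run from the project root. You still need to take "classes" back out of :paths before deploying.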

Deploy issues

If you do compile ahead of time to test that compilation works, make sure that you remove “classes” from your :paths and rm -rf classes before deploying. We ran into a very annoying issue where stale code was getting deployed from classes instead of from our src directory.
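A small guard run before each deploy can catch both leftovers. The following is a minimal sketch rather than our actual tooling: the function name check_no_aot_leftovers is hypothetical, it assumes deps.edn lives in the project root, and the deps.edn check is a crude text match rather than a real edn parse.

```shell
# Hypothetical pre-deploy guard: fail if AOT leftovers could shadow src.
# Takes the project root as its argument (default: current directory).
check_no_aot_leftovers() {
  root="${1:-.}"
  rc=0

  # A leftover classes directory may contain stale compiled code.
  if [ -d "$root/classes" ]; then
    echo "error: $root/classes exists; run: rm -rf classes" >&2
    rc=1
  fi

  # Crude text check: any quoted "classes" entry in deps.edn is suspicious.
  if [ -f "$root/deps.edn" ] && grep -q '"classes"' "$root/deps.edn"; then
    echo "error: \"classes\" is still listed in $root/deps.edn" >&2
    rc=1
  fi

  return "$rc"
}
```

Run it as check_no_aot_leftovers . and only continue with the deploy if it succeeds.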

Deploys that will fail take longer to complete

As a rule of thumb, if a deploy takes longer than a few minutes, it will fail. Successful deploys seem to finish quickly (under about 4 minutes), whereas deploys that will eventually fail sit in a Running state for ~10 minutes before finally failing.

Debugging with CloudWatch

I found this set of CloudWatch filters helpful for dealing with errors at different points of app development:

  • Find any general errors:
    {$.Type="Alert"}
  • Find deployment failures:
    {$.Msg="LoadIonsFailed"}
  • Find runtime errors, for example when hitting an ion endpoint via HTTPS:
    {$.Msg="IonLambdaException"}

Using the aws CLI helps as well. I added a script to invoke CloudWatch with common parameters.

Time fields come back as milliseconds since the UNIX epoch; the jq script converts them to human-readable ISO 8601 timestamps.

 aws logs filter-log-events \
    --log-group-name "$group_name" \
    --log-stream-name-prefix "$stream_name" \
    --max-items "$max_items" \
    --filter-pattern "$pattern" \
    --start-time "$start_time" \
    | \
    jq '.events
    | map({
       logStreamName,
       timestamp: (.timestamp / 1000) | localtime | strflocaltime("%FT%T"),
       message: .message | fromjson |
               (.Timestamp = (.Timestamp / 1000 | localtime | strflocaltime("%FT%T")))
      })
    '
# Time ($start_time above) has to be milliseconds since the UNIX epoch:
# date +%s - number of seconds since the epoch
# convert hours/minutes into milliseconds since epoch
if [ -z "$hours_ago" ]; then
  start_time=$(( $(date +%s) * 1000 - (60 * minutes_ago * 1000 )))
else
  start_time=$(( $(date +%s) * 1000 - (60 * 60 * 1000 * hours_ago )))
fi
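The $pattern variable in the script above can be filled from the filters listed earlier. A small helper keeps the quoting in one place; the function name ion_filter_pattern and the short names alert, deploy, and runtime are made up for this sketch. Single quotes stop the shell from touching the $ and the double quotes inside each pattern.

```shell
# Hypothetical helper: map a short name to one of the filter patterns above.
ion_filter_pattern() {
  case "$1" in
    alert)   printf '%s\n' '{$.Type="Alert"}' ;;
    deploy)  printf '%s\n' '{$.Msg="LoadIonsFailed"}' ;;
    runtime) printf '%s\n' '{$.Msg="IonLambdaException"}' ;;
    *)       echo "unknown filter: $1" >&2; return 1 ;;
  esac
}
```

Usage: pattern=$(ion_filter_pattern deploy) before calling aws logs filter-log-events.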

Testing with an in-memory DB

TLDR, use this repo:

https://github.com/ComputeSoftware/datomic-client-memdb

It implements the datomic.client.api in-memory.

I was surprised to find no testing information on the datomic site (at least, I couldn’t find any). A kind stranger on the Clojurians Slack mentioned the above project, which I found works well for in-memory testing.

Example usage in a test:

(ns my-ns
  (:require [clojure.test :refer [use-fixtures]]
            [compute.datomic-client-memdb.core :as memdb]
            [datomic.client.api :as d]
            [my-project.schema :as schema]))

(def ^:dynamic *client* nil)
(def test-db "test")

(defn with-client
  [f]
  (with-open [c (memdb/client {})]
    (binding [*client* c]
      (d/delete-database *client* {:db-name test-db})
      (d/create-database *client* {:db-name test-db})
      (let [conn (d/connect *client* {:db-name test-db})]
        (schema/transact-schema-and-data conn))
      (f))))

(use-fixtures :each with-client)

;; in a test:
;; (let [conn (d/connect *client* {:db-name test-db})] ...)

These two videos were very useful for the general overview of testing with Datomic:

Test-driven Development with Datomic

Datomic: up and running by Misophistful

Both use com.datomic/datomic-free for an in-memory DB, which doesn’t work with the datomic.client.api.

The DB produced by datomic.api is not compatible with the datomic.client.api.

On-prem vs cloud docs

When you browse to the docs page, you are greeted with two doors to choose from:

https://docs.datomic.com/

A naive visitor may assume that if you are using “cloud” you will not need to peruse “on-prem”. This is a false assumption: there is much useful information in the on-prem docs that is not replicated in the cloud section. For example, the “Reference” section of the on-prem docs is full of useful information that you will miss entirely if you only browse the cloud docs. The layout of the two sections is similar and some of the info is duplicated (Reference -> ACID in on-prem and Transactions -> ACID in cloud, for example). From a user experience perspective, the differences should be made clearer, or perhaps a third section should cover datomic in general, leaving the on-prem and cloud sections to cover only what is unique to each.

A more practical difference between the two is that they support different query grammars. For example, returning a “single scalar” via '[:find ?field . :where []] is not supported in cloud.

I didn’t realize this until getting an exception when running a query.

https://docs.datomic.com/on-prem/query.html#find-specifications

vs

https://docs.datomic.com/cloud/query/query-data-reference.html#find-specs

Query notes

Return a collection of maps

You can return a collection of maps from datomic:

https://docs.datomic.com/on-prem/query.html#return-maps

[:find ?artist-name ?release-name
 :keys artist release
 :where [?release :release/name ?release-name]
        [?release :release/artists ?artist]
        [?artist :artist/name ?artist-name]]

I didn’t see this used in any tutorials, so it came as a surprise when I happened upon it in the documentation. We had repetitive code to convert vectors to maps before realizing we could use this.

Pull multiple entities + find

You can use pull multiple times in one query, and mix pull expressions with other find elements:

[:find ?field-one ?field-two (pull ?e [:entity/attr1 :entity/attr2])
                             (pull ?e2 [:entity2/attr1])]
;;  And you can combine this with :keys
[:find ?field-one ?field-two (pull ?e [:entity/attr1 :entity/attr2])
                             (pull ?e2 [:entity2/attr1])
 :keys field-one field-two my-entity my-entity2]

But I found support for :keys only worked if the pull expressions came at the end of the find spec.

Local dev + testing and datomic.ion.cast

In local REPL development and when running tests, cast was throwing a mysterious error. After much wasted time, we found that the solution was to invoke initialize-redirect before executing any code that uses cast:

(require '[datomic.ion.cast :as cast])
(cast/initialize-redirect :stdout)
;;; Rest of your code here

com.datomic/ion-dev depends on old org.clojure/data.xml

If you use org.clojure/data.xml in a project, you may get a collision with ion-dev. We had to exclude it in deps.edn:

{com.datomic/ion-dev {:mvn/version "0.9.234" :exclusions [org.clojure/data.xml]}}

Executing ions helpers locally

I didn’t see this mentioned in any tutorials or expounded upon in the docs. If you are using Amazon SSM to keep DB connection configuration info, you’ll use ion/get-env and ion/get-app-info (listed here: https://docs.datomic.com/cloud/ions/ions-reference.html#ion-parameters) to retrieve information about your CloudFormation stack in order to construct a connection map. When you execute these outside of the AWS environment (in a local REPL), they return nil.

The docs include this line:

When running outside Datomic Cloud, get-env returns the value of the DATOMIC_ENV_MAP environment variable, read as edn.

and for get-app-info:

When running outside Datomic Cloud, get-app-info returns the value of the DATOMIC_APP_INFO_MAP environment variable, read as edn.

So it is up to you to set these properly in order to connect to your system outside of a deployment.
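For a local REPL this could look like the sketch below. The map contents are assumed placeholders for illustration (:env :dev and an app name of my-app); use whatever keys your own code reads from the two maps.

```shell
# Placeholder values for illustration; both variables are read as edn.
# Single quotes keep the edn's double quotes intact.
export DATOMIC_ENV_MAP='{:env :dev}'
export DATOMIC_APP_INFO_MAP='{:app-name "my-app"}'

# Start the REPL from this same shell so it inherits the variables.
```

With these set, ion/get-env and ion/get-app-info should return the corresponding maps instead of nil.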


Dan Vingo
This is the personal site of Dan Vingo