I started a new contracting job a few months ago writing Clojure, with Datomic as the system of record. We deploy code using Datomic Ions.
This post is a collection of learnings I’ve picked up while using this system: surprises, useful non-obvious findings, and stumbling blocks I ran into.
I found this by chance in one of the Datomic testing videos (linked in the testing section). I was very surprised it isn’t listed on the Datomic site:
http://www.learndatalogtoday.org/
When you execute your Ions code via a REPL, some namespaces are already required (including those you required during dev), which can mask compilation errors that will only surface when your code runs after a deploy. Instead of waiting for a deploy and tracing down bugs via CloudWatch logs, you can compile your code locally to confirm compilation works:
1. Create a classes directory: mkdir classes
2. Add "classes" to :paths in deps.edn: {:paths ["classes"]}
3. Compile: clj -Adev -e "(compile 'my-ns)"
If you have compile-time errors, or namespaces that are not required explicitly in an ns form, step 3 will inform you.
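For context, a minimal sketch of the relevant deps.edn piece, assuming your sources live in src:

;; deps.edn (sketch): the compile output directory sits alongside
;; your normal source paths so the compiled classes are on the classpath
{:paths ["src" "classes"]}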
Make sure you remove classes when going to deploy - see the next note.
https://clojure.org/guides/deps_and_cli#aot_compilation
If you do compile ahead of time to test that compilation works, make sure that you remove “classes” from your :paths and rm -rf classes before deploying. We ran into a very annoying issue where stale code was getting deployed from classes instead of from our src directory.
As a rule of thumb, if a deploy takes longer than a few minutes it is going to fail. Successful deploys complete quickly (under four minutes in our experience), whereas deployments that will eventually fail sit in a Running state for ~10 minutes before finally failing.
I found this set of CloudWatch filters helpful for dealing with errors at different points of app development:
{$.Type="Alert"}
{$.Msg="LoadIonsFailed"}
{$.Msg="IonLambdaException"}
Using the AWS CLI helps as well. I added a script that invokes CloudWatch with common parameters. Time fields come back as milliseconds since the UNIX epoch; the jq script converts them to human-meaningful ISO 8601 timestamps.
aws logs filter-log-events \
--log-group-name "$group_name" \
--log-stream-name-prefix "$stream_name" \
--max-items "$max_items" \
--filter-pattern "$pattern" \
--start-time $start_time \
| \
jq '.events
| map({
logStreamName,
timestamp: (.timestamp / 1000) | localtime | strflocaltime("%FT%T"),
message: .message | fromjson |
(.Timestamp = (.Timestamp / 1000 | localtime | strflocaltime("%FT%T")))
})
'
# Time ($start_time above) has to be milliseconds since the UNIX epoch:
# date +%s - number of seconds since the epoch
# convert hours/minutes into milliseconds since epoch
if [ -z "$hours_ago" ]; then
start_time=$(( $(date +%s) * 1000 - (60 * minutes_ago * 1000 )))
else
start_time=$(( $(date +%s) * 1000 - (60 * 60 * 1000 * hours_ago )))
fi
TL;DR: use this repo:
https://github.com/ComputeSoftware/datomic-client-memdb
It implements datomic.client.api in memory. I was surprised to find that there is no testing information on the Datomic site (I couldn’t find any, at least). A kind stranger on the Clojurians Slack mentioned the above project, which I found works well for in-memory testing.
Example usage in a test:
(ns my-ns
  (:require [clojure.test :refer [deftest is use-fixtures]]
            [compute.datomic-client-memdb.core :as memdb]
            [datomic.client.api :as d]
            [my-project.schema :as schema]))
(def ^:dynamic *client* nil)
(def test-db "test")
(defn with-client
[f]
(with-open [c (memdb/client {})]
(binding [*client* c]
(d/delete-database *client* {:db-name test-db})
(d/create-database *client* {:db-name test-db})
(let [conn (d/connect *client* {:db-name test-db})]
(schema/transact-schema-and-data conn))
(f))))
(use-fixtures :each with-client)
;; in a test:
;; (let [conn (d/connect *client* {:db-name test-db})] ...)
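Fleshing that comment out, a minimal test might look like this (hypothetical: it assumes transact-schema-and-data installs a :user/name string attribute):

(deftest user-name-test
  (let [conn (d/connect *client* {:db-name test-db})]
    (d/transact conn {:tx-data [{:user/name "Ada"}]})
    ;; query the current db value; rows come back as vectors
    (is (= #{["Ada"]}
           (set (d/q '[:find ?n
                       :where [_ :user/name ?n]]
                     (d/db conn)))))))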
These two videos were very useful for the general overview of testing with Datomic:
Test-driven Development with Datomic
Datomic: up and running by Misophistful
These both use com.datomic/datomic-free for an in-memory DB, which doesn’t work with the datomic.client API: the DB produced by datomic.api is not compatible with datomic.client.api.
When you browse to the docs page, you are greeted with two doors to choose from: Cloud and On-Prem.
A naive visitor may assume that if you are using “cloud” you will not need to peruse “on-prem”. This is a false assumption: there is much useful information in the on-prem docs that is not replicated in the cloud section. For example, the “Reference” section of the on-prem docs is filled with useful information which you will miss entirely if you only browse the cloud docs. The layout of the two sections is similar, and some of the info is replicated (Reference -> ACID in on-prem, and Transactions -> ACID in cloud, for example). From a user-experience perspective it should be made clearer what the differences are, or maybe a third section should be created covering info that applies to Datomic in general, with the on-prem and cloud sections covering only info unique to them.
A more practical difference between the two styles of running Datomic is that the supported query grammar differs between them. For example, returning a “single scalar” via [:find ?field . :where ...] is not supported in cloud. I didn’t realize this until getting an exception when running a query.
https://docs.datomic.com/on-prem/query.html#find-specifications
vs
https://docs.datomic.com/cloud/query/query-data-reference.html#find-specs
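If you need a single scalar with the cloud client, one workaround (a sketch, assuming a db value and an :artist/name attribute as in the docs’ examples) is to return a relation and unwrap it yourself:

;; On-prem only: the `.` find spec returns a single scalar directly
;; (d/q '[:find ?name . :where [?e :artist/name ?name]] db)

;; Cloud client: return a relation and take the first value of the first row
(ffirst (d/q '[:find ?name
               :where [?e :artist/name ?name]]
             db))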
You can return a collection of maps from datomic:
https://docs.datomic.com/on-prem/query.html#return-maps
[:find ?artist-name ?release-name
:keys artist release
:where [?release :release/name ?release-name]
[?release :release/artists ?artist]
[?artist :artist/name ?artist-name]]
I didn’t see this used in any tutorials, so it came as a surprise when I happened upon it in the documentation. We had repetitive code converting vectors to maps before realizing you could use this.
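With :keys, each row comes back as a map keyed by the names you supplied, so the query above returns results shaped like this (values illustrative):

[{:artist "George Martin" :release "Off The Beatle Track"}
 {:artist "George Martin" :release "Plays Help!"}]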
You can use pull multiple times, as well as include pull alongside other find-elem clauses:
[:find ?field-one ?field-two (pull ?e [:entity/attr1 :entity/attr2])
(pull ?e2 [:entity2/attr1])]
;; And you can combine this with :keys
[:find ?field-one ?field-two (pull ?e [:entity/attr1 :entity/attr2])
(pull ?e2 [:entity2/attr1])
:keys field-one field-two my-entity my-entity2]
But I found support for :keys only worked if the pull expressions came at the end of the find spec.
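Concretely, the orderings we observed (attribute and key names as in the examples above):

;; Worked for us: plain variables first, pull expressions last
[:find ?field-one ?field-two (pull ?e [:entity/attr1])
 :keys field-one field-two my-entity]

;; Did not work for us: a pull expression ahead of plain variables
[:find (pull ?e [:entity/attr1]) ?field-one ?field-two
 :keys my-entity field-one field-two]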
In local REPL development and when running tests, cast was throwing a mysterious error. After much wasted time, we tracked down the solution: invoke initialize-redirect before executing any code that uses a cast:
(require '[datomic.ion.cast :as cast])
(cast/initialize-redirect :stdout)
;;; Rest of your code here
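With the redirect in place, casts print to stdout in the REPL instead of erroring. For example (the event map contents here are illustrative; :msg is the required key):

(cast/event {:msg "SchemaTransacted" :tx-count 42})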
If you use org.clojure/data.xml in a project, you may get a collision with ion-dev; we had to exclude it in deps.edn:
{com.datomic/ion-dev {:mvn/version "0.9.234" :exclusions [org.clojure/data.xml]}}
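In full context, assuming ion-dev lives in an :ion-dev alias (the alias name here is an assumption; adjust to your setup), that might look like:

{:aliases
 {:ion-dev
  {:extra-deps
   {com.datomic/ion-dev {:mvn/version "0.9.234"
                         :exclusions  [org.clojure/data.xml]}}}}}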
I didn’t see this mentioned in any tutorials or expounded upon in the docs.
If you are using Amazon SSM to keep DB connection configuration info, you’ll use ion/get-env and ion/get-app-info (listed here: https://docs.datomic.com/cloud/ions/ions-reference.html#ion-parameters) to retrieve information about your CloudFormation stack in order to construct a connection map. When you execute these outside of the AWS environment (in a local REPL), they return nil.
The docs include this line:
When running outside Datomic Cloud, get-env returns the value of the DATOMIC_ENV_MAP environment variable, read as edn.
and for get-app-info:
When running outside Datomic Cloud, get-app-info returns the value of the DATOMIC_APP_INFO_MAP environment variable, read as edn.
So it is up to you to set these properly in order to connect to your system outside of a deployment.
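A sketch of doing that for local REPL work (the map contents are illustrative; use whatever your code expects back from get-env and get-app-info):

;; In your shell before starting the REPL, e.g.:
;;   export DATOMIC_ENV_MAP='{:env :dev}'
;;   export DATOMIC_APP_INFO_MAP='{:app-name "my-app" :deployment-group "my-app-dev"}'
(require '[datomic.ion :as ion])

(ion/get-env)      ;; => {:env :dev}
(ion/get-app-info) ;; => {:app-name "my-app" :deployment-group "my-app-dev"}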