-
Five years of the blog
I haven't written anything for a while because I got hooked on the Highload Cup contest. For two weeks I couldn't think about anything else and kept trying to improve my solution. Then I remembered that the blog has an anniversary: the first post was published five years ago, on August 8, 2012.
It was a naive note titled “Рассылка смс в Питоне” (“Sending SMS in Python”). There I explain in plain terms how to send SMS through the API of a service that no longer exists. In those days I worked at Energosbyt in Chita and wrote in 1C, Delphi and Python, while Lisps and FP were nothing more than wet dreams.
Five years is still a long time by internet standards. Many people I used to read gave up on their blogs over time and let their domains expire. But I enjoy it: the blog has become a real hobby that, I hope, will last my whole life.
Over this time I've written 320 posts. Not that many, but I'm not chasing quantity. Some notes, originally planned as a couple of paragraphs, grew to several screens.
At some point I caught myself abusing cheap tricks: complaining about services, mocking those who disagree with me. It works, but not in the direction I'd like. Now I try to write only on important topics: programming, education, career, books.
It was unpleasant to watch blogs I loved degenerate into commentary on political news. I decided to cut that topic off as well.
I decided to move toward an English-speaking audience. In Russia, three and a half anonymous users write in Clojure, so there's no point in writing about it in Russian.
Technical notes in English sometimes get a strong response. After I published the article on migrating from Postgres to Datomic, 9,000 English-speaking visitors came to my site on the first day.
After several successful articles about Clojure, I was added to the Planet Clojure aggregator. If a post contains the word “Clojure” in English, it gets into the aggregator. So when I don't want that to happen, I write in Russian.
I plan to add tags, categories and other goodies to the site. Luckily, the blog is static and built on a Ruby engine, the sources are on GitHub, and I know whom I can copy-paste from. In the meantime, here is simply a list of posts worth attention:
-
Conceptual languages
This is a short note on what I think about programming languages.
Briefly, I consider some languages conceptual. In a conceptual language, every part obeys some global idea. This idea shapes the design of the language, the way it behaves and develops. It affects libraries, community and ecosystem.
Take Lisp, for example. The idea is that when we store code as data, we gain incredible possibilities to process code as data or turn data into code. Hardly any other language offers this to the same extent; it is what defines Lisp.
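To make the "code as data" idea concrete, here is a minimal sketch in Clojure. The names form and rewritten are made up for this illustration.

```clojure
;; A quoted form is ordinary data: a list holding a symbol and two numbers.
(def form '(+ 1 2))

;; The first element of the list is the symbol +, not the function itself.
(first form)

;; eval treats the same list as code and returns 3.
(eval form)

;; Because the form is a list, we can rewrite it like any other list,
;; here by swapping the operator, and then evaluate the result.
(def rewritten (cons '* (rest form)))
(eval rewritten)
```

Macros build on exactly this: they receive forms as lists and return new lists for the compiler to evaluate.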
In Erlang, every task is solved by a cascade of lightweight processes that act as actors. Two actors may communicate directly even when they run on different servers. Actors are not Unix threads: they are scheduled by the Erlang runtime rather than the operating system.
With Haskell, you've got quite a strong and flexible type system. Types are the most important part of a typical Haskell program. Once it compiles, it is very likely to work.
Clojure, a modern Lisp dialect, provides immutable data structures and powerful abstractions for concurrency.
By contrast, non-conceptual languages do not have such a global approach. They try to implement as many features as possible to satisfy every domain: OOP, anonymous functions (lambdas), lazy evaluation, etc. As a result, we've got everything and nothing: each part of such a language is not as powerful as its analogues in the languages I mentioned above.
Take Python, for example. Although it has such basic functional blocks as map, reduce and lambdas, programming with it in a functional way would be a mess.
Every part of classical Javascript is just ugly.
Java, a language with a static type system, allows you to pass null instead of any object and end up with a NullPointerException.
Although conceptual languages are not perfect, they seem easier for me to learn because they have some common rules that cannot be broken. Say, in Haskell, you just do not have a plain null value. In Clojure, you cannot modify a dictionary (a map) in place, and so on.
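A tiny sketch of the Clojure rule mentioned above. The map and its keys are hypothetical. "Updating" a map returns a new map and leaves the original untouched.

```clojure
;; An immutable map.
(def user {:name "Ivan"})

;; assoc does not change user; it returns a new map with the extra key.
(def with-email (assoc user :email "test@test.com"))

user        ; still {:name "Ivan"}
with-email  ; {:name "Ivan", :email "test@test.com"}
```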
They cannot be substituted with other languages. Really, how can you substitute Lisp or Erlang? There aren’t any alternatives for them.
I believe the future belongs to conceptual languages. To develop AI or distributed systems, we need something more sensible than yet another language with classes and syntactic sugar.
I’m not sure it could be Lisp, but something that borrows most of its features.
-
Weekly links #28
-
REST APIs are REST-in-Peace APIs. Long Live GraphQL.
Well, although the title is a bit provocative, the main idea is true. We really need something more suitable than REST. It has been great for the last decade, but now it's time to move on.
-
Great article on XML, trees and zippers in Clojure. For a long time, I was considering parsing XML as a sort of black magic. With Clojure, the process is transparent and easy to understand.
-
Students are Better Off without a Laptop in the Classroom
Thoughts on students using laptops in a classroom. Again, the benefit of the IT industry and “smart” devices is overestimated. As I mentioned earlier, books and people are the only way to learn effectively.
-
Comparing Haskell to Clojure: syntax, types, code snippets, etc.
-
-
Weekly links #27
I've just returned from EuroClojure, it was amazing! Thanks to everybody who made it happen.
Here are some interesting links I found on Medium:
-
The Lisp approach to AI (Part 1)
Some thoughts on the role of Lisp in exploring AI systems.
-
Show me a business problem and I’ll do my best to avoid it
Jason Fried, a founder of Basecamp, shares his notes on dealing with problems.
-
A Pythonist finds a new home at Clojure land
The title says it all.
-
Why I chose ClojureScript over JavaScript
So do I.
-
-
Berlin. Urban details
I attended EuroClojure in Berlin and want to tell you about the city. I'm not much of a traveler: I don't travel often and am not particularly observant. Reading Lebedev or Varlamov, for a long time I couldn't understand why they keep writing about litter bins and sidewalks. So what if the paving is better; you can walk on ours too!
But this time the urban environment really got to me. I walked around and was amazed at how well the city is designed. I was even more puzzled at how I had failed to see the difference between a bad urban environment and a good one before. Apparently, after thirty, something finally woke up in me.
First of all, the sidewalks are laid superbly. It may be tiles, massive slabs, cobblestones or small stones, but you can walk on them and admire the pattern endlessly. All the pieces fit together like Lego, and everything follows the pattern. When the tiles are not square (or are laid at an angle), the edge tiles are not cut; instead they are made in a special shape to join the neighboring surfaces.
Next, there is no exposed soil anywhere. The same Varlamov and Lebedev have written a hundred times about the harm of open flower beds. Any patch of soil that plants need is, first, sunk below sidewalk level and, second, covered with a grate or packed with pebbles. Back home, dirt and sand spill from sidewalks and flower beds all year round, and the street cleaners scrape it back onto the flower bed.
I was shocked that there are no curbs. The sidewalk is completely seamless. The size and pattern of the tiles change, cobblestones alternate with slabs, but the level never drops. At pedestrian crossings, at exits from adjacent properties, next to building entrances: in short, there are no curbs. I walked eight blocks in one direction and did not meet a single one on my way.
All buildings are level with the ground. I don't know what the mystery is, but in Russia there are no doors without a porch. The entrance to any shop, office or apartment building starts with steps. And next to them is a ramp at a 60-degree angle. Why can't the foundation pit be dug a meter deeper? Which standards say that a door must be a meter above the ground?
This is a business center:
And this is a residential entrance:
If it is not clear what harm curbs do, borrow a wheelchair. Sit in it at home and try to get to the shop. Or put a child in a regular stroller, groceries in the lower basket, and off you go through an underground crossing; I'd like to see you try.
The photo below shows that sometimes there really is a height difference between the sidewalk and the building. But look how smoothly the transition is made! A height of 20 centimeters is spread over a length of 15 meters, so the angle is under one degree. And how beautiful it turned out.
This is the case when a wheelchair user can actually roll up, and a person with a heart condition can walk up without a racing pulse.
Now litter bins. Either Lebedev bit me or something, but every bin hides a hundred important details, and now I get it. In the district where I stayed, the bins are identical, with the same dimensions and color. Visual recognition kicks in: to find a bin, you only need to glance around, and an orange spot will surely flash somewhere. At least one bin is visible from every point. Not once did I need one and find none around.
Next, a bin does not stand on the ground but is mounted on a pole about a meter high. This lets you spot a bin from far away or in a crowd. The opening is not like a bucket's but smaller and angled. There is intent here too: rain won't get into such an opening, and you can't plug the bin with a bag of garbage, as lazy jerks sometimes do back home.
You can't throw anything into such a bin; you have to walk up and put it in. This sharply reduces the urge to toss cigarette butts or bottles to show off. Back home it goes like this: office plankton smoke on a high porch, with a bucket somewhere below. Everyone throws a butt; it's a kind of sport. Around the bin there is a chi-squared distribution of butts and spit.
But just try throwing when the bin is at chest level. You might hit someone and get into trouble.
The bins are very neat and clean, painted orange. At first I took them for mailboxes. Bins never stand next to benches; they are always at least ten meters apart. An example with a bus stop:
Strange people, these Germans: why don't they want to sit next to a bin, as is customary back home? With a girl on your right and a garbage bucket on your left. So convenient, everything at hand.
In the middle of road crossings there are safety islands where a pedestrian can wait out a red light without the risk of being hit. The islands are fenced with curbs. A rare case when they are really needed.
Manhole covers are mostly square. Storm drain grates are designed so that a bicycle or stroller wheel won't get caught. The slots for water run perpendicular to the direction of traffic at that spot. If the place is a walk-through area, they simply make round holes.
Bike lanes and bike parking are everywhere; everyone rides bikes, young and old, even pregnant women and pensioners. I have no statistics, but I suppose this sharply reduces road traffic. For example, 15 bicycles are parked on a small patch near a shop. Now imagine that each person came by car. How much space would be needed? Ten times more, plus the traffic.
Traffic jams are laughable compared to ours. 26 kilometers across the city from the center to the airport: clear roads, stops only at traffic lights. On a weekday afternoon, Carl.
Many women go without a bra. Not all of them, of course, but more than back home. Tattoos are a common sight. These are not freaks with piercings, just small patterns or inscriptions. People dress however they like, sometimes provocatively, and nobody cares.
And that's just bins and sidewalks. There are also parks, squares, courtyards and a million other things. You could walk for days and soak it all in. In terms of urban environment, Berlin is top-notch.
-
Educational startups
Today, educational startups are everywhere, and their number grows daily. Hardly a week goes by without a new HTML/CSS/JavaScript educational site appearing on the Internet.
Sorry if I offend someone, but I don't believe in remote education through sites that offer video lessons. More precisely, there may be some effect, of course, but a much weaker one than from standard education with people and books.
About ten years ago, the Internet was full of paid video courses on DVDs. These were home-made lessons in which students covered the basics of PHP and HTML. Today's startups remind me of those DVDs. The only difference is that you no longer need to order a disk: you watch lessons online right after you have paid.
I'm not running an educational site, and I don't want to damage anyone's reputation or business. Let's just discuss some points that seem important to me.
Let's switch to a browser right now and google the phrase “educational startup”. For me, the first link is “Are Education Startups The New Dot Com?” That article links to another one titled “Edtech Is The Next Fintech”. Read them; they highlight my concerns.
Below them, there is a bunch of list-like posts titled “N best educational startups”:
- 16 Startups Poised to Disrupt the Education Market
- 10 emerging edtech startups of India - YourStory.com
- 10 Indian Education Startups to Look Out for in 2017
- TechCrunch Disrupt 2016 - 7 EdTech Startups
- 29 edtech startups in Southeast Asia
And so on. Those were only the first two pages. Let's check AngelList then:
14,615 COMPANIES 2,646 INVESTORS 19,127 FOLLOWERS 2,675 JOBS
14,615 companies. There are about 250 countries and territories in the world. I'll take 200 instead, because of wars or lack of development in some of them. Dividing the number of educational startups by the number of countries gives 14,615 / 200 ≈ 73 per country.
Don't you think it's a bit more than we need?
Look, we have about three search engines to find anything on the Internet. There are probably two to five mobile operators in your country. We've got one Wikipedia. But there are 14,615 educational companies that want to make money on the educational market.
I really doubt they are about real education
In fact, a startup pursues one goal: to push users into buying lessons by making them think this could really raise their level. Site developers add competitive elements to manipulate users: leaderboards, a learning progress scale styled as a route with milestones and a crown or a goblet at the end. Every startup adds a chat room where users share their success and feel proud.
I've seen several educational sites and must confess they don't bring any revolutionary ideas. Yes, some of them provide a smart environment with an IDE in the browser. But the truth is that only two ways of learning really work: people and books.
Sometimes people ask me how to gain experience. Usually they have spent some time solving educational tasks but cannot start a real project. I always answer the same: join a project where professionals work. In a team of highly qualified people, you will grow quite quickly.
Reading books helps you systematize the fragments of random knowledge that you've picked up from Twitter, Stack Overflow, Wired and so on. Everything that educational sites try to sell you has already been published in books. Really, there are tons of books about PHP, HTML, Java, Python or whatever. You may borrow them for free.
On the Internet, you can buy a used book for a few dollars. Lessons are usually sold by subscription: in a month, you will lose access unless you pay again. The book will be yours forever.
Yes, reading a book is a bit more difficult than watching a video. It's hard and boring. You even need to run the code in your head. There are no widgets or chats. Still, it really works.
A book is really important today because our knowledge is not organized. We pick up random fragments and miss important details. I would compare a book to an asphalt paver that moves slowly but smooths out every bump and hole in your mind.
People and books are the only way to learn
Recently, I finished reading the book “Web Development with Clojure”. Although I thought I knew Clojure pretty well, the book turned out to be a discovery for me. I've got plenty of hints and techniques that I'm willing to try in the future.
Education in the broad sense is a hard process. Educational startups make you think it has become easy. That is wrong.
Once I finished reading my first Clojure book, I could solve any primitive task, like sorting or finding the min/max element in a list using recursion. But it took a huge effort to write an HTTP server that manages a database connection, renders templates, writes logs, calls the Twitter API and so on.
The reason was that all the lessons and tutorials miss some special knowledge about how to manage complexity: how to connect the parts of your application together.
A computer is not the best tool for education. It's a great tool but also an entertainment center. Ideally, you should turn off all the messengers, close YouTube, Twitter… On your laptop, a lot of things may interrupt you. When you are tired, your brain can always find a pretext to switch to something fun.
IT education is not only about coding; it also includes negotiation. Sometimes you might be 100% right but unable to get your ideas across. You may offend your customer with an unsuitable manner of speaking. Only people who work close to you can correct those mistakes, not online lessons. By the way, I have never seen a lesson that covers any of the above.
Of course, I do not blame startups for spreading across the Internet. The reason it happens is a lack of professionals. This is the market's reaction. Did you want more programmers? Here you've got them. Young people know that IT companies are interested in hiring more people. Often, watching just several videos is enough to get a PHP/WordPress job paid in US dollars.
Universities are not in charge of your further employment. It is a common situation when you've just finished your final year but cannot find a job because the industry changes so fast. Nor will the government support you. Education seems to be the last thing they are interested in. Today, my country wages two wars, plays geopolitics and eradicates imported cheese while the level of education goes down year after year.
Educational sites might be a hope. But they devalue the real meaning of education.
The process of self-education is hard. Only you are interested in it, not startups
OK, I'm about to finish and I've got a question. I've heard that the most powerful language is Lisp. At least Alan Kay said so (the guy who coined the term OOP). So did Stallman, Dijkstra and other great programmers. Do you know any educational site which offers Lisp lessons?
I googled for a bit to find any. Nothing on Coursera, the best-known lesson hub. Nothing on Netology. At least Hexlet has a SICP section where they retell it in Russian (see my note on books).
No, they won't teach you the most powerful language. So what is the final goal then? Do we need more Python or Java programmers? We have already got lots of them, but we still suffer from weird interfaces and buggy applications. It won't work that way.
Let me summarize.
- I'm not against anyone's business. I even believe some people who watched those videos really made progress in their careers. Maybe they could not find proper books, or this way of learning simply suits them.
- But I'd like to name things properly. Educational startups are businesses in their own right. The goal of a business is to make money, not to make you cleverer. In fact, the longer you keep your monthly subscription active, the better the startup does.
- The educational market is definitely overheated. It looks like a bubble.
- Startups tell you that education is easy and fun. It's not.
- People and books help a lot. Online lessons: well, yes, but much less.
-
Weekly links #26
-
Good news for those who are afraid of not finding a job with Clojure.
-
Clojure In London: HealthUnlocked
Originally HealthUnlocked was built on top of .NET, but over the last few years they have migrated to a Microservice based architecture featuring Clojure.
-
Postgres to Datomic on Hacker News
Nice comments there. By the way, thanks to everybody who has read my previous post.
-
Russian Internet Pioneer Anton Nosik Dies At 51
Goodbye Anton and thank you. Your contribution is invaluable.
-
-
Migration from Postgres to Datomic
Recently, I migrated my Clojure-driven pet project from PostgreSQL to Datomic. This is Queryfeed, a web application that fetches data from social networks. I've been running it for several years, treating it as a playground for experiments. Long ago, it was written in Python; then I ported it to Clojure.
That was a great experience: I had just finished reading the book “Clojure for the Brave and True” and was eager to apply the new knowledge to something practical rather than solving Euler problems in vain.
This time, I've made another effort and switched the database backend to Datomic. Datomic is a modern, fact-driven database developed at Cognitect to be used in conjunction with Clojure. It really differs from classical RDBMSs such as MySQL or PostgreSQL. For a long time, I had been wondering whether I should try it. Meanwhile, more and more Clojure/conj talks have been published on YouTube. At work, we use a vast PostgreSQL database, and the code base is tied too closely to it. Switching over a weekend is not an option. So I decided to port my pet project to Datomic in my spare time.
Naturally, before doing this, I googled for a while and was surprised by how little information I found on the Internet. There were just three posts, and they did not cover the subject in detail. So I decided to share my experience here. Maybe it will help somebody with their migration.
Of course, I cannot guarantee that the steps described below will meet your requirements. Each database is different, so it's impossible to develop a universal tool that could handle all the cases. But at least you may borrow some of them.
Table of Contents
- Introduction
- Dump Postgres database
- Adding Datomic into your project
- Loading the data into Datomic
- Update the code
- HTML templates
- Remove JDBC/Postgres
- Update unit test
- Infrastructure (final touches)
- Conclusion
Introduction
Before we begin, let's talk about the reasons to switch to Datomic. That question cannot be answered in just one or two points. Before Datomic, I had been working with PostgreSQL for several years and consider it great software. There is hardly a task that Postgres cannot deal with. Here are just some of its features:
- streaming replication, smart sharding;
- geo-spatial data, PostGIS;
- full-text search, trigrams;
- JSON(b) data structures;
- typed arrays;
- recursive queries;
- and tons of other benefits.
So if Postgres is really so great, why switch then? In my view, Datomic brings the following benefits to a project:
- Datomic is simple. In fact, it has only two operations: read (querying) and write (transaction).
- It supports joins as well. Once you have a reference, it can be resolved into a nested map. References may be recursive. In PostgreSQL or any other RDBMS, you always get a flat result with possibly duplicated rows. The ORM logic that parses a raw SQL response might be too complicated to understand.
- Datomic was developed on the same principles as Clojure: simplicity, immutability and declarative style. Datomic shares Clojure's values.
- It accumulates changes over time like Git or any other version control system. With Datomic, you may always go back in time to get the history of an order or collect audit logs.
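To illustrate the point about flat results with duplicated rows, here is a small sketch in plain Clojure. It does not use Datomic or JDBC. The rows, the attribute names and the nest-orders helper are all hypothetical. It only shows the kind of regrouping an application or ORM has to do after a SQL join, and the nested shape that a resolved reference would give you directly.

```clojure
;; Hypothetical flat rows, as a SQL join of users and orders might return
;; them: the user columns repeat for every order.
(def rows
  [{:user-id 1 :user-name "Ivan" :order-id 10}
   {:user-id 1 :user-name "Ivan" :order-id 11}
   {:user-id 2 :user-name "Anna" :order-id 12}])

(defn nest-orders
  "Groups flat rows by user and collects the order ids into a vector."
  [rows]
  (->> rows
       (group-by (juxt :user-id :user-name))
       (mapv (fn [[[id uname] user-rows]]
               {:user/id id
                :user/name uname
                :user/orders (mapv :order-id user-rows)}))))

;; One nested map per user instead of one row per order.
(nest-orders rows)
```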
Let's outline the general steps we should go through to complete the migration. These are:
- dump your Postgres data;
- add Datomic into your project;
- load the data into Datomic;
- rewrite the code that operates on data;
- rewrite your HTML templates;
- update or add unit tests;
- remove JDBC/Postgres from your project;
- set up infrastructure (backups, console, etc).
As you see, it is not as simple as one might think, even for a small project. In my case, migrating Queryfeed took about a week of working at night. That includes:
- two days to read the whole Datomic documentation;
- one day to migrate the data;
- two days to fix the business logic code and templates;
- two days to deploy everything to the server.
As for the documentation, I highly recommend reading it before doing anything. Please do not rely on random Stack Overflow snippets. Datomic is completely different from classical SQL databases, so your long-term Postgres or MySQL experience won't carry over.
A quick tip: since it can be difficult to read lots of text from a screen, I send any page I wish to read to my Kindle using the official Amazon extension for Chrome. The page appears on my Kindle in a minute and I read it there.
Once you've finished with the docs, feel free to move on to the next step: dumping your PostgreSQL data.
Dump Postgres database
Exporting your data into a set of files should not be difficult. I guess your project has a projectname.db module that handles most of the database stuff. It should have the clojure.java.jdbc module imported and a *db* or db-spec variable declared. Your goal is, for every table you have in the database, to run a query like select * from <table_name> against it and save the result into a file.
What file format to use depends on your own preferences, but I highly recommend standard EDN files rather than JSON, CSV or whatever. The main point in favor of EDN is that it handles extended data types such as dates and UUIDs. In my case, every table has at least one date field, for example created_at, that is not null and is set to the current time automatically. When using JSON or YAML, the dates will be just strings, so you need to write extra code to restore a native java.util.Date instance from a string. The same goes for unique identifiers, UUIDs.
In addition, since EDN files represent native Clojure data structures, you don't need to add the org.clojure/data.json dependency to your project. Everything can be done with out-of-the-box functions. The next snippet dumps all the users from your Postgres database into a users.edn file:
    (require '[clojure.java.jdbc :as jdbc])
    (def db-spec {... your JDBC spec map...})
    ;; a shortcut that runs a query against our database
    (def query (partial jdbc/query db-spec))
    ;; print the result set as EDN and write it to a file
    (spit "users.edn" (with-out-str (-> "select * from users" query prn)))
And that is it! With only one line of code, you've just dumped the whole table into a file. Repeat it several times, substituting the name of the *.edn file and the table. If you have many tables, wrap it with a function:
    (defn dump [table]
      (spit (format "%s.edn" table)
            (with-out-str
              (-> (format "select * from %s" table) query prn))))
Then run it against a vector of table names rather than a set, since the order is important. For example, if the users table has a foreign key to the orders table, orders should be loaded first.
To check whether your dump is correct, try to restore it from a file as follows:
    (-> "users.edn" slurp read-string first)
Again, it is so simple to perform such things in Clojure. With one line of code, you have just read the file, restored the Clojure data from it and taken the first map from the list. In the REPL, you should see something like:
{:id 1 :name "Ivan" :email "test@test.com" ... other fields }
That means the dump step went well.
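As an aside on the choice of EDN made above, here is a minimal sketch of how a date and a UUID survive the round trip. The row is made up. I use clojure.edn/read-string here instead of the core read-string from the snippets above, which is my own substitution: the edn reader only reads data and never evaluates code.

```clojure
(require '[clojure.edn :as edn])
(import '(java.util Date UUID))

;; A sample row with a date and a UUID, similar to what a query may return.
(def row {:id 1
          :created_at (Date. 0)
          :token (UUID/fromString "00000000-0000-0000-0000-000000000001")})

;; pr-str writes the date and the UUID as #inst and #uuid tagged literals.
(def dumped (pr-str row))

;; edn/read-string restores them as java.util.Date and java.util.UUID.
(def restored (edn/read-string dumped))

(instance? Date (:created_at restored)) ; true
(instance? UUID (:token restored))      ; true
(= row restored)                        ; true
```

With JSON, both values would come back as plain strings and would need manual parsing.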
Adding Datomic into your project
I won't dwell on this step since it is well covered in the official documentation. Briefly, you need to:
- register on Datomic site, it is free;
- set up your GPG credentials;
- add Datomic repository and the library into your project;
- (optional) if you use a Postgres-driven backend for Datomic, create a new Postgres database using the SQL scripts from the sql folder, then run a transactor.
Here is a brief example of my setup:
    ;; project.clj
    (defproject my-project "0.x.x"
      ...
      :repositories {"my.datomic.com" {:url "https://my.datomic.com/repo"
                                       :creds :gpg}}
      :dependencies [...
                     [com.datomic/datomic-pro "0.9.5561.50"]
                     ...]
      ...)
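Run lein deps to download the library. You will probably be prompted to enter the passphrase for your GPG key.
A quick try in the REPL:
    (require '[datomic.api :as d])
    (def uri "datomic:mem://test-db")
    ;; the database must exist before we can connect to it
    (d/create-database uri)
    (def conn (d/connect uri))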
Loading the data into Datomic
In this step, we will load the previously dumped data into your Datomic installation.
First, we need to prepare the schema before loading the data. A schema is a collection of attributes. Each attribute by itself is a small piece of information; for example, a :user/name attribute keeps a string value and indicates a user's name.
An entity is a set of attributes linked together by a system identifier. Thinking in RDBMS terms, an attribute is a DB column whereas an entity is a row of a table. That is what distinguishes Datomic from schema-less databases such as MongoDB. In Mongo, every entity may have any structure you wish, even within the same collection. In Datomic, you cannot write a string value into a number or a boolean into a date. One note: an entity may own an arbitrary number of attributes.
For example, in Postgres, if a column is not null and has no default value, you just cannot skip it when inserting a row. In Datomic, you may submit as few or as many attributes as you want when performing a transaction. Imagine we have a user model with ten attributes: a name, an email, etc. When creating a user, I may pass only a name and there won't be an error. So make sure you submit all the required attributes.
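Since Datomic will not complain about a missing attribute, one option is to check the maps yourself before transacting. Below is a minimal sketch in plain Clojure. The set of required attributes and both helper names are assumptions made for this example, not part of Datomic or of my project.

```clojure
(require '[clojure.set :as set])

;; Hypothetical set of attributes that every user entity must carry.
(def required-user-attrs #{:user/name :user/email})

(defn missing-attrs
  "Returns the set of required attributes absent from the entity map."
  [required entity]
  (set/difference required (set (keys entity))))

(defn check-entity
  "Returns the entity unchanged, or throws if a required attribute is missing."
  [required entity]
  (let [missing (missing-attrs required entity)]
    (if (empty? missing)
      entity
      (throw (ex-info "Missing required attributes"
                      {:missing missing :entity entity})))))

;; Only the name is given, so :user/email is reported as missing.
(missing-attrs required-user-attrs {:user/name "Ivan"})
```

Running check-entity over every map just before the transaction turns a silent gap into an explicit error.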
A Datomic schema is represented by native Clojure data structures: maps, keywords and vectors. That's why schemas are stored in EDN files as well. A typical initial schema for a fresh Datomic installation may look as follows:
    [
     ;; Enums
     {:db/ident :user.gender/male}
     {:db/ident :user.gender/female}
     {:db/ident :user.source/twitter}
     {:db/ident :user.source/facebook}
     ;; Users
     {:db/ident       :user/pg-id
      :db/valueType   :db.type/long
      :db/cardinality :db.cardinality/one
      :db/unique      :db.unique/identity}
     {:db/ident       :user/source
      :db/valueType   :db.type/ref
      :db/cardinality :db.cardinality/one
      :db/isComponent true}
     {:db/ident       :user/source-id
      :db/valueType   :db.type/string
      :db/cardinality :db.cardinality/one}
     ...
    ]
The first four are special entries that serve as enum values. I will discuss them in more detail later.
Again, check the official documentation that describes schema usage.
Now that we have prepared a schema, let's add some boilerplate code to our db namespace:
    (ns project.db
      (:require [clojure.java.io :as io]
                [datomic.api :as d]))
    ;; in-memory database for test purposes
    (def db-uri "datomic:mem://test-db")
    ;; global Datomic connection wrapped in atom
    (def *conn (atom nil))
    ;; A function to initiate the global state
    (defn init-db []
      (d/create-database db-uri)
      (reset! *conn (d/connect db-uri)))
    ;; reads an EDN file located in `resources` folder
    (defn read-edn [filename]
      (-> filename io/resource slurp read-string))
    ;; reads and loads a schema from EDN file
    (defn load-schema [filename]
      @(d/transact @*conn (read-edn filename)))
I hope the comments make the meaning of the code clear. I just declared a database URL, a global connection, a function to connect to the DB and two helper functions.
The first function just reads an EDN file and returns a data structure. Since our files are stored in the resources folder, there is an io/resource wrapper in the threading chain.
The second function also reads a file but additionally performs a Datomic transaction, passing the data as a schema.
The db-uri variable holds a URL-like string. Currently, we use in-memory storage for test purposes. I really doubt you can load the data directly into SQL-driven storage without errors, so let's just practice for a while. Later, when the import step is ready, we will just switch the db-uri variable to a production-ready URL.
With the code above, we are ready to load the schema. I put my initial schema into the file resources/schema/0001-init.edn so I may load it as follows:
    (init-db)
    (load-schema "schema/0001-init.edn")
Now that we have a schema, let's load the previously saved Postgres data. We need to add more boilerplate code. Unfortunately, there cannot be a common function that maps your Postgres fields into Datomic attributes. The functions that convert your data might look a bit ugly, but they are for one-time use only, so please don't mind.
For each EDN file that contains data of a specific table, we should:
- read a proper file, get a list of maps;
- convert each PostgreSQL map into Datomic map;
- perform Datomic transaction passing a vector of Datomic maps.
Below is an example of my pg-user-to-datomic function that accepts a Postgres-driven map and turns it into a set of Datomic attributes:
;; URI, UUID and Date are assumed to be imported in the ns form:
;; (:import java.net.URI [java.util UUID Date])
(defn pg-user-to-datomic
  [{:keys [email first_name timezone source_url locale
           name access_token access_secret source token
           status id access_expires last_name gender
           source_id is_subscribed created_at]}]
  {:user/pg-id id
   :user/email (or email "")
   :user/first-name (or first_name "")
   :user/timezone (or timezone "")
   :user/source-url (URI. source_url)
   :user/locale (or locale "")
   :user/name (or name "")
   :user/access-token (or access_token "")
   :user/access-secret (or access_secret "")
   :user/source (case source
                  "facebook" :user.source/facebook
                  "twitter" :user.source/twitter)
   :user/source-id source_id
   :user/token (UUID/fromString token)
   :user/status (case status
                  "normal" :user.status/normal
                  "pro" :user.status/pro)
   :user/access-expires (or access_expires 0)
   :user/last-name (or last_name "")
   :user/gender (case gender
                  "male" :user.gender/male
                  "female" :user.gender/female)
   :user/is-subscribed (or is_subscribed false)
   :user/created-at (or created_at (Date.))})
Yes, it looks ugly and is a bit annoying, but you have to write something like this for every table you have.
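Since Datomic rejects nil attribute values (see the notes below), one way to cut down the (or ... "") noise is to build the map as-is and drop nil-valued keys before the transaction. This is a sketch rather than the code used in the project; remove-nils is a hypothetical helper name.

```clojure
;; hypothetical helper: drops every key whose value is nil;
;; false and empty strings are kept since they are real values
(defn remove-nils [m]
  (into {} (remove (comp nil? val)) m))

(remove-nils {:user/pg-id 1
              :user/email nil
              :user/is-subscribed false})
;; => {:user/pg-id 1, :user/is-subscribed false}
```

Keep in mind that skipping an attribute is not the same as storing an empty string: a query clause that binds the attribute will not match entities that lack it.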
Here is the code to load a table into Datomic:
(->> "users.edn" slurp read-string (map pg-user-to-datomic) transact!)
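The transact! helper is not listed here. A minimal sketch of what it might look like, assuming the *conn atom from project.db; the project.import namespace, the batches helper and the batch size of 1000 are assumptions made for illustration.

```clojure
(ns project.import
  (:require [datomic.api :as d]
            [project.db :refer [*conn]]))

;; splits rows into vectors of at most n items
(defn batches [n rows]
  (map vec (partition-all n rows)))

;; transacts the rows batch by batch;
;; deref waits for each transaction to complete
(defn transact! [rows]
  (doseq [batch (batches 1000 rows)]
    @(d/transact @*conn batch)))
```

Batching keeps a single transaction from growing with the size of the table; for small tables one transaction would do just as well.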
Before we go further, let’s discuss some important notes on importing the data.
Avoid nils
Datomic does not support nil values for attributes. When you do not have a value for an attribute, you should either skip it or pass an empty value: a zero, an empty string, etc. That’s why most of the expressions above are wrapped in (or ... "").
Shrink your tables
Migrating to a new datastore backend is a good chance to refactor your schema. For those who have spent years working with relational databases, it is no secret that typical SQL applications suffer from lots of tables. In SQL, it is not enough to keep just “entity” tables: users, orders, etc. Often, you need to associate a product with colors, a blog post with tags or a user with permissions. That leads to product_colors, post_tags and other bridge tables. You join them in a query to “go through” from a user to their orders, for example.
Datomic is free from bridge tables. It supports reference attributes that may point to any other entity. In addition, each attribute may carry multiple values. For example, if we want to link a blog post with a set of tags, we’d rather declare the following schema:
[;; Tag
 {:db/ident :tag/name
  :db/valueType :db.type/string
  :db/cardinality :db.cardinality/one
  :db/unique :db.unique/identity}

 ;; Post
 {:db/ident :post/title
  :db/valueType :db.type/string
  :db/cardinality :db.cardinality/one}
 {:db/ident :post/text
  :db/valueType :db.type/string
  :db/cardinality :db.cardinality/one}
 {:db/ident :post/tags
  :db/valueType :db.type/ref
  :db/cardinality :db.cardinality/many}]
In Postgres, you would need a post_tags bridge table with post_id and tag_id foreign keys. In Datomic, you simply pass a vector of IDs in the :post/tags field when creating a post.
Migrating to Datomic is a great chance to get rid of those tables.
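To make this concrete, here is a sketch of the transaction data for the schema above. The tag names and the post fields are made up for illustration. The tags are referenced through lookup refs on the unique :tag/name attribute; as far as I know, a lookup ref resolves only against entities that already exist, so the tags go into an earlier transaction than the post.

```clojure
;; transaction 1: create the tags
(def tags-tx
  [{:tag/name "clojure"}
   {:tag/name "datomic"}])

;; transaction 2: create a post that references both tags
(def post-tx
  [{:post/title "Migrating to Datomic"
    :post/text  "..."
    :post/tags  [[:tag/name "clojure"]
                 [:tag/name "datomic"]]}])
```

Each vector would then be passed to d/transact in turn, for example @(d/transact @*conn tags-tx) followed by @(d/transact @*conn post-tx).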
Use enums
Both Postgres and Datomic support enum types. An enum type is a set of values. An instance of an enum type may have only one of those values.
In Postgres, I use enum types a lot. They are fast, reliable and provide strong consistency of your data. For example, if you have an order with possible “new”, “pending” and “paid” states, please don’t use the varchar type for that. Sooner or later you will write something wrong there, for example mix up the letter case or make a typo. So you’d better declare the schema as follows:
create type order_state as enum (
  'order_state/new',
  'order_state/pending',
  'order_state/paid'
);

create table orders (
  id serial primary key,
  state order_state not null default 'order_state/new'::order_state,
  ...
);
Now you cannot submit an unknown state for an order.
Although Postgres enums are great, the JDBC library makes our life a bit more difficult by forcing us to wrap enum values into PGobject when inserting data. For example, to submit a new state for an order, you cannot just pass a string "order_state/paid". You’ll get an error saying you are trying to submit a string for an order_state type column. So you have to wrap your string into a special object:
;; PGobject comes from the org.postgresql.util package
(defn get-pg-obj [type value]
  (doto (PGobject.)
    (.setType type)
    (.setValue value)))

(def get-order-state
  (partial get-pg-obj "order_state"))

;; now, composing parameters for a query
{:order_id 42
 :state (get-order-state "order_state/paid")}
Another disadvantage here is the inconsistency between select and insert queries. When you just read the data, you get the enum value as a string. But when you pass an enum as a parameter, you still need to wrap it with PGobject. That is a bit annoying.
Datomic also has nice support for enums. There is no special syntax for them. Enums are plain entities that carry nothing but a name, the :db/ident keyword. Above, I have already highlighted them:
[{:db/ident :user.gender/male}
 {:db/ident :user.gender/female}
 {:db/ident :user.source/twitter}
 {:db/ident :user.source/facebook}]
Later, you may reference an enum value by passing just a keyword such as :user.source/twitter. It’s quite simple, fast and keeps your database consistent.
JSON data
Personally, I try to avoid using JSON in Postgres for as long as possible. Adding JSON fields everywhere turns your Postgres installation into MongoDB. It becomes quite easy to make a mistake or corrupt the data and end up in a situation where one half of your JSON data has a particular key and the other half does not.
Sometimes you really need to keep JSON in your DB. A good example might be PayPal Instant Payment Notifications (IPN). These are HTTP requests that PayPal sends to your server when a customer buys something. An IPN body holds about 30 fields and its structure may vary depending on the transaction type. Splitting that data into separate fields and storing all of them across separate columns would be a mess. A solution is to extract only the most significant ones (date, email, sum, order number) and write the rest of the data into a jsonb column. Then, once you need to fetch any additional information from an IPN, for example a tax sum, you may query it as well:
select (data->>'tax_sum')::numeric as tax from ipn where order_number = '123456';
In Datomic, there is no JSON type for attributes. I’m not sure I made the right decision, but I just put that JSON data into a text attribute. Sure, there is no way to access separate fields in a datalog query or apply rules to them. But at least I can restore the data when selecting a single entity:
;; local handler to parse JSON with keywords in keys
;; (json is assumed to be an alias for cheshire.core)
(defn parse-json [value]
  (json/parse-string value true))

(defn get-last-ipn [user-id]
  (let [query '[:find [(pull ?ipn [*]) ...]
                :in $ ?user
                :where [?ipn :ipn/user ?user]]
        result (d/q query (d/db @*conn) user-id)]
    (when (not-empty result)
      (let [item (last (sort-by :ipn/emitted-at result))]
        (update item :ipn/data parse-json)))))
Foreign keys
In an RDBMS, a typical table has an auto-incremented id field that holds a unique number for each row. When you need to refer to another table, an order or a user’s profile, you declare a foreign key that just keeps the value of that id. Since the ids are auto-generated, you never need to care about their actual values, only about their consistency.
Datomic does not provide auto-incremented values. When you import your data, it’s important to handle foreign keys (or references, in terms of Datomic) properly. During the import, we populate the :<entity>/pg-id field that holds the legacy Postgres value. Once you import a table with foreign keys, you may resolve a reference as follows:
{... ;; other order fields
 :order/user [:user/pg-id user_id]}
A reference may be represented as a vector of two elements where the first is an attribute name and the second is its value. Datomic calls such a vector a lookup ref; it works only when the attribute is declared unique.
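Following the same pattern as pg-user-to-datomic, a converter for a table with a foreign key might look like the sketch below. The orders columns (id, user_id, created_at) and the :order/... attribute names are hypothetical, and the lookup ref assumes that :user/pg-id is declared with :db/unique in the schema.

```clojure
;; hypothetical converter for a row of the Postgres orders table
(defn pg-order-to-datomic
  [{:keys [id user_id created_at]}]
  {:order/pg-id      id
   ;; lookup ref: points to the user imported earlier
   :order/user       [:user/pg-id user_id]
   :order/created-at created_at})

(pg-order-to-datomic {:id 10 :user_id 42 :created_at #inst "2017-07-04"})
```

With this approach the users table has to be imported before the orders table, otherwise the lookup ref has nothing to resolve to.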
For new entities created in production after the migration to Datomic, you do not need to submit the .../pg-id value. You may either delete it (retract) once the migration process has finished or just keep it in the database as an indicator that marks legacy data.
Update the code
This step would be the most boring, I believe. You need to scan the whole project and fix those fragments where you access the data from the database.
Since it is good practice to prefix attributes with a namespace, the most common change would be attribute renaming, I believe:
;; before
(println (:name user))

;; after
(println (:user/name user))
You will face fewer problems if you organize special functions that wrap the underlying logic. A good example might be to add get-user-by-id, get-orders-by-user-id and so on.
If you use the HugSQL or YeSQL Clojure libraries, then you already have such functions created dynamically from *.sql files. That is much better than having naked SQL everywhere. Porting such a project to Datomic will be much easier.
HTML templates
Another dull step that cannot be automated is to scan your Selmer templates (if you have them in your project, of course) and to update those fragments where you touch entities’ attributes. For example:
;; before
<p>{{ user.first_name}} {{ user.last_name}}</p>

;; after
<p>{{ user.user/first-name}} {{ user.user/last-name}}</p>
You may access nested entities as well. Imagine a user has a reference to their social profile:
<p>{{ user.user/profile.profile/name }}</p> ;; "twitter", "facebook" etc
Datomic encourages us to use enums whose values are just keywords. Sometimes, you need to implement a case...then pattern in your Selmer template and render some content depending on an enum value. This may be a bit tricky since Selmer does not support keyword literals. In the example above, a user has a :user/source attribute that references an enum with possible values :user.source/twitter or :user.source/facebook. Here is how I figured out switching on them:
{% ifequal request.user.user/source.db/ident|str ":user.source/twitter" %}
  <a href="https://twitter.com/{{ user.user/username }}">Twitter page</a>
{% endifequal %}
{% ifequal request.user.user/source.db/ident|str ":user.source/facebook" %}
  <a href="{{ user.user/profile-url }}">Facebook page</a>
{% endifequal %}
In the example above, we have to turn a keyword into a string using the |str filter to compare both values as strings.
To find all the Selmer variables or operators, just grep your templates folder for the {{ or {% literals.
Remove JDBC/Postgres
Now that your project is Datomic-powered and does not need JDBC drivers anymore, you may either remove them from the project or at least move them to the dev dependencies needed only for development purposes.
Scan your project, grepping it for the jdbc and postgres terms to find those namespaces that still use the legacy DB backend. Remove any that are still present. Open your root project.clj file and remove the jdbc and postgresql packages from the :dependencies vector. Ensure you can still build and run the application and the unit tests.
Update unit tests
Datomic is great in that you may use the in-memory backend when running tests. That makes them run much faster and without having to set up a Postgres installation on your machine.
I believe your project is able to detect whether it is in dev, test or prod mode. If it’s not, take a look at the Luminus framework, which handles this quite well. For each type of environment, you specify its own database URL. For tests, it will be in-memory storage.
Using the standard clojure.test namespace, you wrap each test with a fixture that does the following steps:
- creates a new database in memory and connects to it;
- runs all the schemas against it (migrations);
- populates it with predefined test data (users, orders etc; also known as “fixtures”);
- runs the test itself;
- drops the database and disconnects from it.
These steps should be run for each test. In that case, we can guarantee that every test has its own environment and does not depend on other tests. It’s good practice for a test to get a fresh installation that has not been touched by previous tests.
Some preparation steps are:
(ns your-project.test.users
  (:require [clojure.test :refer :all]
            [your-project.db :as db]))

(defn migrate
  "Loads all the migrations"
  []
  (doseq [file ["schema/0001-init.edn"
                "schema/0002-user-updated.edn"]]
    (db/load-schema file)))

(defn load-fixtures
  "Loads all the fixtures"
  []
  (db/load-schema "fixtures/test-data.edn"))

(defn test-fixture [f]
  (db/init)    ;; this function reads the config,
               ;; creates the DB and populates
               ;; the global Datomic connection
  (migrate)
  (load-fixtures)
  (f)          ;; the test is run here
  (db/delete)  ;; deletes a database
  (db/stop))   ;; stops the connection

(use-fixtures :each test-fixture)
Now you may write your tests as well:
(deftest user-may-login ...)
(deftest user-proceed-checkout ...)
For every test, you will have a database running with all the migrations and test data loaded.
If you still do not have any tests in your project, I urge you to add them soon. Without tests, you cannot be sure you do not break anything when changing the code.
Infrastructure (final touches)
In the final section, I will highlight several important points that relate to the server management.
Setting up production Postgres-driven backend
Running an in-memory Datomic database is fun since it really costs nothing. In production, you had better set up a more reliable backend. Datomic supports the Postgres storage system out of the box. To prepare the database, run the following SQL scripts:
sudo su postgres # switch to postgres user
cd /path/to/datomic/bin/sql
psql < postgres-user.sql
psql < postgres-db.sql
psql datomic < postgres-table.sql
The scripts above create a user datomic with the password datomic, then the database datomic with the owner datomic. The last script creates a special table to keep Datomic blocks.
Please do not forget to change the standard datomic password to something more complicated.
Running the transactor
The following page describes how to run a transactor needed by the peer library when you use non-memory data storage. I’m not going to retell it here. Instead, I will share a bit of config to run it automatically using the standard init.d Linux daemon.
Create a file named datomic.conf in your my-project/conf directory and add a symlink to it in the /etc/init.d/ folder. Put the following lines into the file:
description "Datomic transactor"
start on runlevel startup
stop on runlevel shutdown
respawn
setuid <your user here>
setgid <your group here>
chdir /path/to/datomic
script
    exec bin/transactor sql-project.properties
end script
There, /path/to/datomic is a directory where the unzipped Datomic installation is located. sql-project.properties is a transactor configuration file where you should specify the Datomic key sent to your email.
Now that you have put a symlink, try the following commands:
sudo start datomic
status datomic
# datomic start/running, process 5281
sudo stop datomic
Console
Most RDBMSs have UI applications to manage the data. Datomic comes with a built-in console that runs as a web application. Within that console, you can examine the schema and perform queries and transactions.
The following template runs a console:
/path/to/datomic/bin/console -p 8088 <some-alias> <datomic-url-without-db>
In my example, the command is:
$(DATOMIC_HOME)/bin/console -p 8888 datomic \
"datomic:sql://?jdbc:postgresql://localhost:5432/datomic?user=xxxxx&password=xxxxx"
Opening a browser at http://your-domain:8888/browser will show you a dashboard.
Some security issues must be mentioned here. The console does not support any login/password authentication, so it is quite unsafe to run the console on a production server as-is. Implement at least some of the following steps:
- Proxy the console with Nginx. It must not be reachable by itself.
- Limit access by a list of IPs. These may be your office or your home only.
- Allow only secure SSL connections, no plain HTTP. Let’s encrypt would be a great choice (see my recent post).
- Add basic/digest authentication to your Nginx config.
To run the console as a service, create another console.conf file in the /etc/init.d/ directory. Use the datomic.conf file as a template. Substitute the primary command with the one shown above. Now you can run the console only when you really need it:
sudo start console
Backups
Making backups regularly is highly important. The Datomic installation carries a special utility to take care of it. You won’t need to make your backups manually by running pg_dump against the Postgres backend. Datomic provides a high-level backup algorithm that runs in several threads. In addition, it supports the AWS S3 service as a destination.
A typical backup command looks as follows:
/path/to/datomic/bin/datomic -Xmx4g -Xms4g backup-db <datomic-url> <destination>
To access AWS servers, you need to export both the AWS_ACCESS_KEY_ID and AWS_SECRET_KEY variables first or prepend the command with them. In my case, the full command looks something like:
AWS_ACCESS_KEY_ID=xxxxxx AWS_SECRET_KEY=xxxxxxx \
/path/to/datomic/bin/datomic -Xmx4g -Xms4g backup-db \
"datomic:sql://xxxxxxxx?jdbc:postgresql://localhost:5432/datomic?user=xxxxxx&password=xxxxxxx" \
s3://secret-bucket/datomic/2017/07/04
The date part at the end is substituted automatically using the $(shell date +\%Y/\%m/\%d) expression in a Makefile or the following in bash:
date_path=`date +\%Y/\%m/\%d` # 2017/07/04
Add that command into your crontab file to make backups regularly.
Backups as a way to deploy the data
The good news is that the backup structure does not depend on the backend type. No matter whether you dump in-memory storage or a Postgres cluster, the backup can be restored anywhere. This lets us migrate the data on our local machine, make a backup and then restore it into the production database.
Once you have finished migrating your data, launch the backup command described above. The backup should go to S3. On the server, run the restore command:
AWS_ACCESS_KEY_ID=xxxxx AWS_SECRET_KEY=xxxxx \
/path/to/datomic/bin/datomic -Xmx4g -Xms4g restore-db \
s3://secret-bucket/datomic/2017/07/04 \
"datomic:sql://xxxxxxx?jdbc:postgresql://localhost:5432/datomic?user=xxxx&password=xxxxx"
When everything is done without mistakes, the server will pick up the new data.
Conclusion
After spending about a week on moving from Postgres to Datomic, I can say it was really worth it. Although Datomic does not support most of the Postgres smart features like geo-spatial data or JSON structures, it is much closer to Clojure after all. Since it was made by the same authors, Datomic looks like a continuation of Clojure. And that is a huge benefit that may outweigh the disadvantages mentioned above.
Surfing the Internet, I found the following links that may also be helpful:
- A Minor Victory with Datomic
- Datomic Setup
- Using Datomic? Here’s How to Move from PostgreSQL to DynamoDB Local
I hope you enjoyed reading this material. You are welcome to share your thoughts in the comments section.
-
Let's encrypt
I’ve just tried the Let’s encrypt service and may say it works like a charm! I am really impressed by its simplicity and robustness. It really works as promised, within several lines in a shell. That’s how good software should be made.
Let’s encrypt is a certificate authority that issues short-term SSL certificates for you. A typical certificate expires in 90 days and then you request a new one.
Why use Let’s encrypt in particular? There are some other SSL providers who also offer free certificates, just google for “free SSL cert”. The main reason is that Let’s encrypt is fully automated. You don’t even need to open their site. The whole setup can be done in a bash session in 5 minutes.
Here is a quick example of setting up an SSL certificate on the outdated Ubuntu 12.04 LTS:
- Download the certbot script. Certbot is open source software that communicates with the Let’s encrypt service via the secure ACME protocol:
  wget https://dl.eff.org/certbot-auto
  chmod a+x certbot-auto
- Back up your Nginx config by copying your *.conf files from /etc/nginx/conf.d/ somewhere. Then run:
  sudo /path/to/certbot-auto --nginx
  This command will ask you several questions. In most cases, the default choice would be enough. It scans your current Nginx config and makes the required changes. Finally, you will be prompted for your email. Please enter an existing one since it requires confirmation. In a minute, check your inbox and follow the secret link to confirm your account.
- Reload the Nginx service with something like
  sudo service nginx restart
  Open your site in Chrome, go to the Developer console, the “Security” tab, and click “View certificate” below the green label.
  First, all the labels should be green, not red or orange. Second, the “Let’s encrypt” authority should be mentioned in the certificate’s details.
- You have gone through the main steps so far, although it would be great to set up automatic renewal for your certificate. Add the following into your crontab config:
  0 */12 * * * /path/to/certbot-auto renew --no-self-upgrade
  This job tries to renew the certificate twice a day, as the official guide recommends.
To find out more, please examine the Certbot documentation. It has a nice setup wizard with step-by-step instructions for all the operating systems. You may also automate Let’s encrypt not with a bash script but within your favorite language. See the “Client options” page for the existing libraries.
Finally, I urge you to enable SSL for your project right now if you haven’t done this yet. Nowadays, there is no excuse for sending your clients’ data as-is without encryption. Please respect your visitors. Setting up SSL has never been as easy as it is with Let’s encrypt.
-
-
Weekly links #25
Writing to you from Turkey. Hang in there, only one dress left!
-
Object-Oriented Declarative Input/Output in Cactoos
Instead of calling static procedures we want to use objects, the way they are supposed to be used.
-
A fundamental flaw of the Git CLI
An emotional review of Git’s shortcomings.
-
“Slow” sex is, as it were, not hot. That is why most people try to tear off their lovers’ penis or clitoris: it seems very passionate to them, even if their partners do not enjoy it much.
-
…you will be poor until you are thirty at the very least. Harvard? Nobody cares. There are no jobs.
-
Investing on the stock exchange: popular misconceptions
The basics of investing. Highly recommended reading.
-