Parse shell one-liners with pyparsing



For one of my projects I needed some one-liners parser to AST. I’ve tried PLY, pyPEG and a few more. And stopped on pyparsing. It’s actively maintained, works without magic and easy to use.

Ideally I wanted to parse something like:

LANG=en_US.utf-8 git diff | wc -l >> diffs

To something like:

(= LANG en_US.utf-8)
(>> (| (git diff) (wc -l))
    (diffs))

So let’s start with simple shell command, it’s just space-separated tokens:

import pyparsing as pp


token = pp.Word(pp.alphanums + '_-.')
command = pp.OneOrMore(token)

command.parseString('git branch --help')
>>> ['git', 'branch', '--help']

It’s simple, another simple part is parsing environment variables. One environment variable is token=token, and list of them separated by spaces:

env = pp.Group(token + '=' + token)

env.parseString('A=B')
>>>[['A', '=', 'B']]

env_list = pp.OneOrMore(env)

env_list.parseString('VAR=test X=1')
>>> [['VAR', '=', 'test'], ['X', '=', '1']]

And now we can easily merge command and environment variables, mind that environment variables are optional:

command_with_env = pp.Optional(pp.Group(env_list)) + pp.Group(command)

command_with_env.parseString('LOCALE=en_US.utf-8 git diff')
>>> [[['LOCALE', '=', 'en_US.utf-8']], ['git', 'diff']]

Now we need to add support of pipes, redirects and logical operators. Here we don’t need to know what they’re doing, so we’ll treat them just like separators between commands:

separators = ['1>>', '2>>', '>>', '1>', '2>', '>', '<', '||', '|', '&&', '&', ';']
separator = pp.oneOf(separators)
command_with_separator = pp.OneOrMore(pp.Group(command) + pp.Optional(separator))

command_with_separator.parseString('git diff | wc -l >> out.txt')
>>> [['git', 'diff'], '|', ['wc', '-l'], '>>', ['out.txt']]

And now we can merge environment variables, commands and separators:

one_liner = pp.Optional(pp.Group(env_list)) + pp.Group(command_with_separator)
            
one_liner.parseString('LANG=C DEBUG=true git branch | wc -l >> out.txt')
>>> [[['LANG', '=', 'C'], ['DEBUG', '=', 'true']], [['git', 'branch'], '|', ['wc', '-l'], '>>', ['out.txt']]]

Result is hard to process, so we need to structure it:

one_liner = pp.Optional(env_list).setResultsName('env') + \
            pp.Group(command_with_separator).setResultsName('command')
result = one_liner.parseString('LANG=C DEBUG=true git branch | wc -l >> out.txt')

print('env:', result.env, '\ncommand:', result.command)
>>> env: [['LANG', '=', 'C'], ['DEBUG', '=', 'true']] 
>>> command: [['git', 'branch'], '|', ['wc', '-l'], '>>', ['out.txt']]

Although we didn’t get AST, but just a bunch of grouped tokens. So now we need to transform it to proper AST:

def prepare_command(command):
    """We don't need to work with pyparsing internal data structures,
    so we just convert them to list.
    
    """
    for part in command:
        if isinstance(part, str):
            yield part
        else:
            yield list(part)


def separator_position(command):
    """Find last separator position."""
    for n, part in enumerate(command[::-1]):
        if part in separators:
            return len(command) - n - 1


def command_to_ast(command):
    """Recursively transform command to AST.""" 
    n = separator_position(command)
    if n is None:
        return tuple(command[0])
    else:
        return (command[n],
                command_to_ast(command[:n]),
                command_to_ast(command[n + 1:]))


def to_ast(parsed):
    if parsed.env:
        for env in parsed.env:
            yield ('=', env[0], env[2])
    command = list(prepare_command(parsed.command))
    yield command_to_ast(command)
   
   
list(to_ast(result))
>>> [('=', 'LANG', 'C'),
>>>  ('=', 'DEBUG', 'true'),
>>>  ('>>', ('|', ('git', 'branch'),
>>>               ('wc', '-l')),
>>>         ('out.txt',))]

It’s working. The last part, glue that make it easier to use:

def parse(command):
    result = one_liner.parseString(command)
    ast = to_ast(result)
    return list(ast)
    
    
parse('LANG=en_US.utf-8 git diff | wc -l >> diffs')
>>> [('=', 'LANG', 'en_US.utf-8'),
     ('>>', ('|', ('git', 'diff'),
                  ('wc', '-l')),
            ('diffs',))]

Although it can’t parse all one-liners, it doesn’t support nested commands like:

echo $(git branch)
echo `git branch`

But it’s enough for my task and support of not implemented features can be added easily.

Gist with source code.

Three months in Cyprus



Nicosia aqueduct

From March to May I was in Cyprus. Most of time I was in Nicosia, but visited most of Cyprus’s cities. Nicosia located in the middle of the island, 30-50 minutes by bus from the sea (Larnaca). The city split between two countries and has some sort of wall.

For entering country I was using Pro Visa, it’s very easy to obtain, you just need to fill some form and send to consulate by email. It allows to stay 90 days in Cyprus.

In Nicosia I was living in city center, and was renting apartment with shared kitchen for €300 including water, electricity and internet. Internet was like 20 MBit/s.

Mobile internet is a bit pricey, I was using MTN and paid €15 for 1.5GB for month. LTE available in almost all cities.

Food is relatively cheap, for everyday meat/fish/seafood I spend around €40 per week. Prices is supermarket near my apartment was €4-5 for pork, €5-6 for chicken, €5-10 for fish and €8-10 for octopus and other seafood, €0.3-1 for vegetables.

Alcohol is also cheap, €1-1.5 for can of beer, €4-5 for six pack. Local wine was around €4-5. Local strong drinks like Ouzo and Zivania was €5-10.

Main problem with shopping in Cyprus, is that shops close very early on Wednesday and Saturday, and they doesn’t work on Sundays.

Although, Nicosia is very comfortable place to live.

In this trip I also visited Israel for two weeks, lived in Tel-Aviv. And Georgia, for two weeks too, lived in Tbilisi. They’re both very nice and interesting places.

The Performance of Open Source Applications



book cover I already read all other books from this series (The Architecture of Open Source Applications volumes I and II) so I decided to read latest book – The Performance of Open Source Applications. And it’s quite a nice book. Not all chapters were interesting for me. But it was lovely to read about network stack of Chrome and about performance problems in it and about Mozilla’s stack problems and solutions.

Also the book contains great explanation why pings on mobile devices can be big, and what problems can it cause.

Lightweight events in Kotlin



In one of my applications I needed some minimalistic event system and all I can found was require creating a ton of classes or use a lot of annotations, or was magick-scala-like with overloaded -=, += and () just because author can do it.

So I decided to write something very simple, with interface like:

ClickEvent on {
    println("CLICK: ${it.x} ${it.y}")
}

ClickEvent(10, 20).emit()

And it was implemented easily:

open class Event<T> {
    var handlers = listOf<(T) -> Unit>()

    infix fun on(handler: (T) -> Unit) {
        handlers += handler
    }

    fun emit(event: T) {
        for (subscriber in handlers) {
            subscriber(event)
        }
    }
}

data class ClickEvent(val x: Int, val y: Int) {
    companion object : Event<ClickEvent>()

    fun emit() = Companion.emit(this)
}

ClickEvent on {
    println("CLICK: ${it.x} ${it.y}")
}

ClickEvent(10, 20).emit()

>>> CLICK: 10 20

You can notice a bit of boilerplate, main reason to have it is because in Kotlin we don’t have static methods, we can’t use inheritance with data classes and also we can’t use companion object of interface (or even parent class) in class that implements it. So in each event we have to duplicate some redundant code:

data class MyEvent(val data: String) {
    companion object : Event<MyEvent>()

    fun emit() = Companion.emit(this)
}

Back to positive side, let’s implement a few other events and try it in action. Start from event without arguments, it can’t be a data class:

class ConnectionLostEvent {
    companion object : Event<ConnectionLostEvent>()

    fun emit() = Companion.emit(this)
}

ConnectionLostEvent on {
    println("Oh, snap!")
}

ConnectionLostEvent().emit()

>>> Oh, snap!

What if we want events hierarchy, we can use sealed class for event type and few classes for events. They also can’t be data classes:

sealed class MouseEvent {
    companion object : Event<MouseEvent>()

    class Click(val x: Int, val y: Int) : MouseEvent() {
        fun emit() = Companion.emit(this)
    }

    class Move(val fromX: Int, val fromY: Int, val toX: Int, val toY: Int) : MouseEvent() {
        fun emit() = Companion.emit(this)
    }
}

MouseEvent on {
    when (it) {
        is MouseEvent.Click -> println("Click x:${it.x} y:${it.y}")
        is MouseEvent.Move -> println(
                "Move from ${it.fromX}:${it.fromY} to ${it.toX}:${it.toY}")
    }
}

MouseEvent.Click(50, 40).emit()
MouseEvent.Move(0, 0, 9, 12).emit()

>>> Click x:50 y:40
>>> Move from 0:0 to 9:12

As you can see it isn’t that nice that it can be implemented with Scala or Clojure. Kotlin isn’t rich language, but it’s simple and works pretty well on Android.

Also you notice that code highlighting in the article looks wrong, it’s because rouge, highlighting engine used by github pages, doesn’t support Kotlin right now, and I just used one for Scala.

Daniel Holden: Build Your Own Lisp



book cover On Reddit I stumbled on Build Your Own Lisp by Daniel Holden and decided to read it. And it’s one of the most interesting technical book I’m read. Reading this book I refreshed my knowledge of C, got few ideas for my own projects and even almost built my own Lisp.

Although in this book explains building process of not actual Lisp, it doesn’t have quotation '(f x), it uses Q-Expressions {f x}. And also it doesn’t have macros. But the book suggested readers to write all of that in the last chapter.

Freeze time in tests even with GAE datastore



It’s not so rare thing to freeze time in tests and for that task I’m using freezegun and it works nice almost every time:

from freezegun import freeze_time


def test_get_date():
    with freeze_time("2016.1.1"):
        assert get_date() == datetime(2016, 1, 1)

But not with GAE datastore. Assume that we have a model Document with created_at = db.DateTimeProperty(auto_now_add=True), so test like:

from freezegun import freeze_time


def test_created_at():
    with freeze_time('2016.1.1'):
        doc = Document()
        doc.put()
        assert doc.created_at == datetime(2016, 1, 1)

Will fail with:

BadValueError: Unsupported type for property created_at: <class 'freezegun.api.FakeDatetime'>

But it can be easily fixed if we monkey patch GAE internals:

from contextlib import contextmanager
from google.appengine.api import datastore_types
from mock import patch
from freezegun import freeze_time as _freeze_time
from freezegun.api import FakeDatetime


@contextmanager
def freeze_time(*args, **kwargs):
    with patch('google.appengine.ext.db.DateTimeProperty.data_type',
                new=FakeDatetime):
        datastore_types._VALIDATE_PROPERTY_VALUES[FakeDatetime] = \
            datastore_types.ValidatePropertyNothing
        datastore_types._PACK_PROPERTY_VALUES[FakeDatetime] = \
            datastore_types.PackDatetime
        datastore_types._PROPERTY_MEANINGS[FakeDatetime] = \
            datastore_types.entity_pb.Property.GD_WHEN
            
        with _freeze_time(*args, **kwargs):
            yield

And now it works!

Conway's Game of Life for Web and Android with ClojureScript



Recently I’ve been working on rerenderer – cross-platform react-like library for drawing on canvas. So I though it would be nice to implement Conway’s Game of Life with it.

All rerenderer applications have two essential parts: controller, that manipulates state and handle events, and view, that renders it. It’s actually MVC, but model (state) here is just an atom.

Let’s start with view:

(ns example.game-of-life.view
  (:require [rerenderer.primitives :refer [rectangle]]))

(def cell-size 40)

(defn cell
  [[x y]]
  (let [cell-x (* cell-size x)
        cell-y (* cell-size y)]
    (rectangle {:x cell-x
                :y cell-y
                :width cell-size
                :height cell-size
                :color "green"})))

(defn root-view
  [{:keys [cells]}]
  (rectangle {:x 0
              :y 0
              :width 1920
              :height 1080
              :color "white"}
    (for [cell-data cells]
      (cell cell-data))))

Here cell is the game cell, it will be rendered as a green rectangle. And root-view is the game scene that will contain all cells.

One nice thing about rerenderer, that it just renders state, so you can change :cells in editor below and see how root-view will rerender it:

Now it’s time for controller. Game logic already was implemented million times, it’s not so interesting, so I’ve just took it from sebastianbenz/clojure-game-of-life and we’ll use tick function from there:

(ns example.game-of-life.controller
  (:require-macros [cljs.core.async.macros :refer [go-loop]])
  (:require [cljs.core.async :refer [<! timeout]]))

(defn main-controller
  [_ state-atom _]
  (go-loop []
    (<! (timeout 100))
    (swap! state-atom update :cells tick)
    (recur)))

It’s very simple, every 100ms we’ll get new game state with tick and update rendering state.

And now the final part, glue that connect controller with view and state:

(ns example.game-of-life.core
  (:require [rerenderer.core :refer [init!]]
            [example.game-of-life.view :refer [root-view]]
            [example.game-of-life.controller :refer [main-controller]]))

(def inital-state {:cells [[1 5] [2 5] [1 6] [2 6]
                           [11 5] [11 6] [11 7] [12 4] [12 8] [13 3] [13 9] [14 3] [14 9]
                           [15 6] [16 4] [16 8] [17 5] [17 6] [17 7] [18 6]
                           [21 3] [21 4] [21 5] [22 3] [22 4] [22 5] [23 2] [23 6] [25 1] [25 2] [25 6] [25 7]
                           [35 3] [35 4] [36 3] [36 4]]})

(init! :root-view root-view
       :event-handler main-controller
       :state inital-state)

And that’s all, in action:

And it works on Android (apk), not without problems, it has some performance issues. And there’s strange behaviour, it’s like ten times slower when screen recorder is on. But it works without changes in code and renders on Android native Canvas:

Source code on github.

Steve Krug: Don't Make Me Think



book cover I’ve always wanted to learn more about UX and found Don’t Make Me Think by Steve Krug. It’s a short book, and it’s interesting and easy to read. It explains usability concepts by examples of web and mobile apps. The book even contains accessibility advices.

I don’t know much about UX, but in my opinion this book is nice.

Lorin Hochstein: Ansible: Up and Running



book cover white Recently I’m started to use Ansible. Before that I mostly worked with Puppet, so I decided to read something about Ansible and read Ansible: Up and Running by Lorin Hochstein.

And it’s a nice book, explains most aspects of Ansible. Shows real world use cases and a lot of playbooks.

Definitely it’s a nice book for whom who start working with Ansible.

Robert C. Martin: The Clean Coder



book cover Friends recommended me to read The Clean Coder: A Code of Conduct for Professional Programmers by Robert C. Martin. And recently I read it. The book explains some parts of engineer’s work, makes some useful points. And proves them with real life stories. Most of things in the book are well known, but it’s nice that author describes not only “good path”, but also situations when something gone wrong.

By the way, the book is interesting to read.