Martin Kleppmann: Designing Data-Intensive Applications



book cover white Recently I wanted to read something about applications design and distributed systems, so I found and read Designing Data-Intensive Applications by Martin Kleppmann. Overall it’s a nice book with a bit of theoretical and practical information. It explains how a lot of things work inside, like databases, messaging systems, and batch/stream processing. The book is a bit high-level, but I guess because of that it’s easy and interesting to read.

Although the last chapter is a bit strange, a tinfoil hat kind of strange.

Kevin R. Fall and W. Richard Stevens: TCP/IP Illustrated, Volume 1



book cover white Recently I was interested how networks work and everywhere I found recommendations to read TCP/IP Illustrated by Kevin R. Fall and W. Richard Stevens. And it’s an educative book, which explains networks even on a physical level. Almost every chapter also contains information about possible problems and vulnerabilities, which is nice and interesting.

Although reading about the same stuff twice is kind of boring, almost everything is explained for IPv4 and IPv6. And not so small part of the books looks like a reference of packets layout and headers.

But I guess it’s nice to have at least a basic knowledge about networks and this book is more than enough.

How I was Full Stack and wrote a mediocre service for searching reaction gifs



screenshot

A while ago I discovered that people often use MRW as a reply in messengers. And not knowing that nowadays even Android keyboard have an option to search gifs, I decided to write some service with a mobile app, a web app and even a Telegram bot, that will do that.

The idea was pretty simple, just parse Reddit, somehow index gifs and allow users to search and share reaction gifs in all possible ways. The result was mediocre at best.

MRW result is mediocre at best

result is mediocre at best

The first part is the parser, it’s pretty simple and can do just two things:

  • index n top of all time posts from r/MRW on an initial run;
  • index n top today posts every 12 hours.

While indexing it gets appropriate links from Reddit’s own images hosting or Imgur, got additional information from nlp-service and put everything in ElasticSearch.

I decided to write it in Clojure because wanted to. The only problem was that Elastisch, a Clojure client for ElasticSearch, wasn’t (doesn’t?) work with the latest version of ElasticSearch. But ElasticSearch REST API is neat, and I just used it.

MRW library doesn’t work with the latest elasticsearch

library doesn't work with the latest elasticsearch

The next and the most RAM consuming part is the nlp-service, it’s written in Python with NLTK and Flask. It also can do just two things:

  • sentiment analysis of a sentence, like {"sentiment": "happiness"} for “someone congrats me”;
  • VADER, which is a some sort of sentiment analysis too, like {"pos": 0.9, "neg": 0.1, "neu": 0.2} for the same sentence.

It doesn’t work very well, because I’m amateur at best in NLP, and had a too small dataset. I was and still planning to make a better dataset with Mechanical Turk in the future.

MRW I have too small dataset

I have too small dataset

The last non-client part is the public facing API, it’s also very simple, written in Clojure, Ring and Compojure. It has just one endpoint /api/v1/search/?query=query. It just requests additional information for the query from nlp-service and searches appropriate gifs in ElasticSearch. Nothing interesting.

MRW public facing api is boring

public facing api is boring

The first client is the web app (source). It’s neat, has just one text input for query and written with ClojureScript and reagent. And it’s so small, that I don’t even use re-frame here.

MRW the web app is neat

the web app is neat

The second client is the mobile app (source). It can search for reaction gifs and can share found gifs to other apps. It’s written with React Native in JavaScript and works only on Android. Yep, I managed to write non-cross-platform RN app, but at least I’m planning to make it cross-platform and publish it to the AppStore.

MRW I managed to write non-cross-platform RN app

I managed to write non-cross-platform RN app

And the last and the most hipsterish client is the Telegram bot (source). It has three types of responses:

  • to /mrw query with appropriate reaction gif;
  • to just /mrw with famous Travolta gif;
  • to /help with obviously help message.

And it’s written in JavaScript with Node.js Telegram Bot API.

MRW I can’t find Travolta gif

I can't find Travolta gif

The last part is deploy. Everything is deployed on docker-cloud. I somehow managed to configure everything a few days before swarm mode announce, so it’s just stacks. But it wouldn’t be a problem to migrate to new swarm mode. The service is deployed as eight containers:

  • ElasticSearch;
  • nginx proxy;
  • letsencrypt nginx proxy companion;
  • the nlp-service;
  • the public API;
  • the parser;
  • the web app (data container);
  • the Telegram bot.

Almost everything worked out of the box, I only changed nginx proxy image to simplify serving assets of the web app. And it’s more than nice, when I push changes to github, docker-cloud rebuilds images and redeploys containers.

MRW almost everything works out of the box

almost everything works out of the box

Summing up everything, it was a totally full stack experience from the cluster on docker-cloud with microservices to the Telegram bot and the mobile app. And the result isn’t the worst part, the worst part is that as a part of my studying I’ve made a presentation for future Software Engineers about that service in Czech.

MRW I’ve made a presentation

I've made a presentation

Sources on github, the web app, the mobile app, the Telegram bot.

Import packages depending on Python version



While working on py-backwards and it’s setuptools integration I found an interesting problem, how to transparently allow people with different Python versions to use different packages. So, imagine we have a package with a structure:

package
  __init__.py
  _compiled_2_7
    package
      __init__.py
      main.py
  _compiled_3_3
    package
      __init__.py
      main.py
  _compiled_3_4
    package
      __init__.py
      main.py
  _compiled_3_5
    package
      __init__.py
      main.py
  _compiled_3_6
    package
      __init__.py
      main.py

And, for example, on Python 2.7 we need to use the package from _compiled_2_7, on 3.3 from _compiled_3_3 and etc.

First of all, we know that if we import package.main, package.__init__ would be imported first:

# __init__.py
print('init')

# main.py
print('main')

# REPL
>>> import package.main
init
main

>>> from package import main
init
main

# shell
 python -m package.main
init
main

So it works all the time. And now if we put special bootstrap script, which would change sys.path depending on current Python version, we can easily import the package that we need:

import sys
import os

VERSIONS = {'_compiled_2_7': (2, 7),
            '_compiled_3_0': (3, 0),
            '_compiled_3_1': (3, 1),
            '_compiled_3_2': (3, 2),
            '_compiled_3_3': (3, 3),
            '_compiled_3_4': (3, 4),
            '_compiled_3_5': (3, 5),
            '_compiled_3_6': (3, 6)}
root = os.path.abspath(os.path.dirname(__file__))


def _get_available_versions():
    for version in os.listdir(root):
        if version in VERSIONS:
            yield version


def _get_version():
    available_versions = sorted(_get_available_versions())[::-1]
    for version in available_versions:
        if VERSIONS[version] <= sys.version_info:
            return version


# We should pass `__name__` as an argument, because
# we can't access `__name__` after module deletion
def _import_module(name):
    version = _get_version()
    version_path = os.path.join(root, version)
    sys.path.insert(0, version_path)
    del sys.modules[name]
    __import__(name)


_import_module(__name__)

Let’s try it on our package, but first, we need to fill _compiled_*/package/__init__.py with:

print('init 2.7')  # replace with required version

And _compiled_*/package/main.py with:

print('main 2.7')  # replace with required version

So now we can try it with Python 2.7:

# REPL
>>> import package
init 2.7

>>> import package.main
init 2.7
main 2.7

>>> from package import main
init 2.7
main 2.7

# shell
 python2 -m package.main
init 2.7
main 2.7

And with Python 3.5:

# REPL
>>> import package
init 3.5

>>> import package.main
init 3.5
main 3.5

>>> from package import main
init 3.5
main 3.5

# shell
 python3 -m package.main
init 3.5
main 3.5

It works!

Guessing complexity of algorithms



Recently, when I was having fun with algorithms exercises, I though, do I solve them correctly? So I thought that it would be neat to somehow find the complexity of an algorithm.

Gathering the data

The first idea was to use timeit and calculate complexity based on consumed time:

import timeit
from numpy.random import randint


def _get_data_set(size):
    return list(randint(1000, size=size))


def main(alist):
    for passnum in range(len(alist) - 1, 0, -1):
        for i in range(passnum):
            if alist[i] > alist[i + 1]:
                temp = alist[i]
                alist[i] = alist[i + 1]
                alist[i + 1] = temp


for size in range(1000, 10000, 1000):
    print(size, '=>', timeit.timeit('main(xs)',
                                    f'xs = _get_data_set({size})',
                                    globals=globals()))                                 
1000 => 0.25852540601044893
2000 => 1.9087769229663536
3000 => 3.760379843064584
4000 => 5.552884255070239
5000 => 9.784064659965225
6000 => 10.014859249000438
7000 => 21.16908495896496
8000 => 19.5277360730106
9000 => 24.729382600053214

But it wasn’t much accurate and useful for small functions, required too big data set and required to run the function multiple times for acquiring meaningful data.

The second idea was to use cProfile:

from cProfile import Profile


profiler = Profile()
profiler.runcall(main, _get_data_set(1000))
profiler.print_stats()
         3 function calls in 0.269 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.269    0.269    0.269    0.269 t.py:43(main)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

But it tracks only function calls and is useless for small algorithmic functions. So I found line_profiler line-by-line profiler:

from line_profiler import LineProfiler


profiler = LineProfiler(main)
profiler.runcall(main, _get_data_set(1000))
profiler.print_stats()
Total time: 2.21358 s
File: /home/nvbn/exp/complexity/t.py
Function: main at line 43

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    43                                           def main(alist):
    44      1000         1232      1.2      0.1      for passnum in range(len(alist) - 1, 0, -1):
    45    500499       557623      1.1     25.2          for i in range(passnum):
    46    499500       729304      1.5     32.9              if alist[i] > alist[i + 1]:
    47    243075       278005      1.1     12.6                  temp = alist[i]
    48    243075       334224      1.4     15.1                  alist[i] = alist[i + 1]
    49    243075       313194      1.3     14.1                  alist[i + 1] = temp

Which works fairly well for this task.

And as for guessing complexity we almost always need only the magnitude of growth, we can just calculate all hits, like:

def _get_hits(fn, *args, **kwargs):
    profiler = LineProfiler(fn)
    profiler.runcall(fn, *args, **kwargs)
    hits = 0
    for timing in profiler.get_stats().timings.values():
        for _, line_hits, _ in timing:
            hits += line_hits
    return hits


for size in range(100, 1000, 100):
    print(size, '=>', _get_hits(main, _get_data_set(size)))
100 => 17257
200 => 69872
300 => 159227
400 => 279847
500 => 429965
600 => 622175
700 => 852160
800 => 1130426
900 => 1418903

So to proof that it’s correct, we can visualize it with matplotlib:

import matplotlib.pyplot as plt


result = [(size, _get_hits(main, _get_data_set(size)))
          for size in range(100, 1000, 100)]

plt.style.use('ggplot')
plt.plot([size for size, _ in result],
         [hits for _, hits in result])
plt.title('Bubble sort: O(n^2)')
plt.show()

On the plot we can obviously see that the complexity is quadratic:

Bubble sort: O(n^2)

Finding complexity

For finding most the appropriate complexity, we need to have a special functions like def get_score(hits: List[Tuple[int, int]]) -> int. And a complexity with the lowest score will be the most appropriate.

Let’s start with constant complexity:

def constant(xs):
    result = 0
    for n in range(5):
        result += n
    return result
    
    
hits = [(size, _get_hits(constant, _get_data_set(size)))
        for size in range(100, 1000, 100)]
[(100, 13), (200, 13), (300, 13), (400, 13), (500, 13), (600, 13), (700, 13), (800, 13), (900, 13)]

Constant: O(1)

We can see, that hits count not depends at all on the size of the data set, so we can calculate the score by just summing diff with the first hit:

def get_constant_score(hits):
    return sum(abs(hits[0][1] - hit) for _, hit in hits[1:])
    
    
get_constant_score(hits)
0

The next is linear complexity:

def linear(xs):
    result = 0
    for x in xs:
        result += x
    return result
    

hits = [(size, _get_hits(linear, _get_data_set(size)))
        for size in range(100, 1000, 100)]
[(100, 203), (200, 403), (300, 603), (400, 803), (500, 1003), (600, 1203), (700, 1403), (800, 1603), (900, 1803)]

Linear: O(n)

We can notice that the order of growth is linear, so we can use the similar approach as with constant complexity, but with growth instead of hits:

def _get_growth(hits):
    for hit_a, hit_b in zip(hits[:-1], hits[1:]):
        yield abs(hit_a - hit_b)


def get_linear_score(hits):
    hits = [hit for _, hit in hits]
    growth = list(_get_growth(hits))
    return sum(abs(growth[0] - grow) for grow in growth[1:])
    
    
get_linear_score(hits)
0

Now it’s time for quadratic complexity:

def quadratic(xs):
    result = 0
    for x in xs:
        for y in xs:
            result += x + y
    return result


hits = [(size, _get_hits(quadratic, _get_data_set(size)))
        for size in range(100, 1000, 100)]
[(100, 20203), (200, 80403), (300, 180603), (400, 320803), (500, 501003), (600, 721203), (700, 981403), (800, 1281603), (900, 1621803)]

Quadratic: O(n^2)

It’s a bit more complicated in this case, the order of growth of the order of growth (like second derivative) should be the same:

def get_quadratic_score(hits):
    hits = [hit for _, hit in hits]
    growth = list(_get_growth(hits))
    growth_of_growth = list(_get_growth(growth))
    return sum(abs(growth_of_growth[0] - grow)
               for grow in growth_of_growth[1:])


get_quadratic_score(hits)
0

With similar approach detection of O(log(n)), O(n*log(n)) and etc can be implemented.

Usable public API

The code used above is only usable in REPL or a similar interactive environment. So let’s define simple and extensible API. First of all, complexity assumptions:

ComplexityLogEntry = NamedTuple('ComplexityLogEntry', [('size', int),
                                                       ('hits', int)])


class BaseComplexityAssumption(ABC):
    title = ''

    def __init__(self, log: List[ComplexityLogEntry]) -> None:
        self.log = log
        self.hits = [hit for _, hit in self.log]

    @abstractmethod
    def get_score(self) -> int:
        ...

So, for example, an assumption for constant complexity will be like:

class ConstantComplexityAssumption(BaseComplexityAssumption):
    title = 'O(1)'

    def get_score(self) -> int:
        return sum(abs(self.hits[0] - hit) for hit in self.hits[1:])

The next part of public API is Analyzer:

class Complexity:
    def __init__(self, title: str, log: List[ComplexityLogEntry]) -> None:
        ...

    def show_plot(self) -> 'Complexity':
        ...


T = TypeVar('T')


class BaseComplexityAnalyzer(Generic[T], ABC):
    title = ''

    @abstractmethod
    def get_data_set(self, size: int) -> T:
        ...

    @abstractmethod
    def run(self, data_set: T) -> None:
        ...
        
    @classmethod
    def calculate(cls, sizes: Iterable[int]) -> Complexity:
        ...
     
    ...

So we can analyze some quadratic algorithm very easily:

class SomethingQuadratic(BaseComplexityAnalyzer[int]):
    title = 'Something quadratic'

    def get_data_set(self, size) -> List[int]:
        from numpy.random import randint
        return list(randint(1000, size=size))

    def run(self, data_set: Iterable[int]) -> None:
        result = []
        for x in data_set:
            for y in data_set:
                result.append((x, y))
                
                
SomethingQuadratic \
    .calculate(range(100, 1000, 100)) \
    .show_plot()

And it works:

Assumption result

Gist with sources.

Robert Love: Linux Kernel Development



book cover Lately, I was interested in Linux internals, so I decided to read Linux Kernel Development by Robert Love. It’s an interesting book, it explains stuff from interrupts to vfs and memory management. It’s easy to read and contains information why historically something was done like it was.

Although the book is more for beginner kernel developers and contains a lot of code samples of the kernel internals. It’s worth reading if you just interested in how it works inside.

Telegram bot that speaks like a gopnik



book cover TLDR bot – http://t.me/swear_bot (speaks only in Russian)

Saturday night I thought that it would be nice to write a telegram bot that will speak like a gopnik, and sometimes can be used in a conversation (inline mode) to find a relevant reply. So a few beers later I implemented it. As a language I chose JavaScript, but probably Python would be better for this kind of project because of a high variety of NLP related libs. Although for Russian in js exists Az.js, which I’ve used in the bot, and this library has almost everything needed by the bot.

Worth to mention as the bot speaks in russian, examples below are in russian too, I add translation in brackets, but sometimes translation doesn’t have much sens.

The bot/telegram API part isn’t much interesting, it’s straightforward and implemented with node-telegram-bot-api.

Move to the interesting part, how it generate replies? The first option is to find trigger word and reply appropriately, like:

user: Хочу новую машину! (I want a new car!)
bot: Хотеть не вредно! (It doesn’t harm to wish)

First of all, we have a list of trigger words regexp and replies, like:

export const TRIGGERS = [
  [/^к[оа]роч[ье]?$/i, 'У кого короче, тот дома сидит!'],
  [/^хо(чу|тим|тят|тел|тела)$/i, 'Хотеть не вредно!'],
];

Now we need to extract words from our text, it’s easy with Az.Tokens, and check if a word matches any trigger regexp:

const getWords = (text: string): string[] =>
  Az.Tokens(text)
    .tokens
    .filter(({type}) => type === Az.Tokens.WORD)
    .map(({st, length}) => text.substr(st, length).toLowerCase());
  
const getByWordTrigger = function*(text: string): Iterable<string> {
  for (const word of getWords(text)) {
    for (const [regexp, answer] of constants.TRIGGERS) {
      if (word.match(regexp)) {
        yield answer;
      }
    }
  }
};

And that little generator would yield all appropriate replies to text based on trigger words.

The second option is to reply to question with an amusing question, like:

user: Когда мы уже пойдём домой? (When we’ll go home?)
bot: А тебя ебёт? (Why the fuck you care?)

So it’s very easy to implement, we just need to find a question mark at the end of text or question words such a “когда” (when), “где” (where) and etc:

const getAnswerToQuestion = (text: string): string[] => {
  if (text.trim().endsWith('?')) {
    return [constants.ANSWER_TO_QUESTION];
  }

  const questionWords = getWords(text)
    .map((word) => Az.Morph(word))
    .filter((morphs) => morphs.length && morphs[0].tag.Ques);

  if (questionWords.length) {
    return [constants.ANSWER_TO_QUESTION];
  } else {
    return [];
  }
};

It returns an array with amusing constants.ANSWER_TO_QUESTION if a message is a question, otherwise returns an empty array.

The third option is most interesting, is to reply with vulgar rhyme, like:

user: хочу в Австрию! (I want to go to Austria)
bot: хуявстрию (dickstria)
user: у него есть трактор (he have a tractor)
bot: хуяктор (dicktor)

So how it works, it just replaces first syllable in nouns and adjectives with “ху” and transformed vowel from the syllable, like “о” → “ё”, “а” → “я” and etc.

So, first of all, we need to extract the first syllable it’s very easy, we just need all characters before consonant that after the first vowel:

const getFirstSyllable = (word: string): string => {
  const result = [];

  let readVowel = false;

  for (const letter of word) {
    const isVowel = constants.VOWELS.indexOf(letter) !== -1;

    if (readVowel && !isVowel) {
      break;
    }

    if (isVowel) {
      readVowel = true;
    }

    result.push(letter);
  }

  return result.join('');
};

Then we need a function, that will replace the first vowel in a word:

const getRhyme = (word: string): ?string => {
  const morphs = Az.Morph(word);
  if (!morphs.length) {
    return;
  }

  const {tag} = morphs[0];
  if (!tag.NOUN && !tag.ADJF) {
    return;
  }

  const syllable = getFirstSyllable(word);
  if (!syllable || syllable === word) {
    return;
  }

  const prefix = constants.VOWEL_TO_RHYME[last(syllable)];
  const postfix = word.substr(syllable.length);

  return `${prefix}${postfix}`;
};

And then we need a function that extracts words from a text and returns possible rhymes:

const getRhymes = (text: string): string[] =>
  getWords(text)
    .map(getRhyme)
    .filter(Boolean)
    .reverse();

It’s pretty easy, right?

So the last option is to reply confused and aggressive when previous options aren’t working, like:

user: wtf
bot: Чё? (What?)

And it’s straightforward and implemented in a function, that aggregate all options:

export default (text: string): string[] => {
  const answers = uniq([
    ...getByWordTrigger(text),
    ...getAnswerToQuestion(text),
    ...getRhymes(text),
  ]);

  if (answers.length) {
    return answers;
  } else {
    return constants.NO_ANSWERS;
  }
}

That’s all. Telegram bot, sources on github.

How thefuck alias handle command line arguments



TLDR it’s not, thefuck uses functions instead aliases starting from 3.16.

Initially, thefuck alias wasn’t working with command line arguments, if history changing was disabled they were just passed to corrected command. So it was just like:

➜ git br
git: 'br' is not a git command. See 'git --help'.

Did you mean one of these?
	branch
	var
➜ fuck --help
git branch [enter/↑/↓/ctrl+c]
GIT-BRANCH(1)                                          Git Manual                                          GIT-BRANCH(1)

NAME
       git-branch - List, create, or delete branches

SYNOPSIS
       git branch [--color[=<when>] | --no-color] [-r | -a]
               [--list] [-v [--abbrev=<length> | --no-abbrev]]
               [--column[=<options>] | --no-column]
               [(--merged | --no-merged | --contains) [<commit>]] [--sort=<key>]
               [--points-at <object>] [<pattern>...]
...

If history changing was enabled it was less dramatic, it was executing git branch, but it was putting git branch --help in history.

Also there was a lot of feature requests, that can’t be implemented without proper command line arguments supports, like #614 and #531.

So, why there was a problem? Let’s examine old zsh alias, for simplicity with disabled history changing:

alias fuck='TF_CMD=$(
    TF_ALIAS=fuck
    PYTHONIOENCODING=utf-8
    TF_SHELL_ALIASES=$(alias)
    thefuck $(fc -ln -1 | tail -n 1)
) && eval $TF_CMD'

In a few words it was working like:

  • get previous command with fc -ln -1 | tail -n 1 (git br);
  • run thefuck with it (thefuck git br);
  • put result in TF_CMD (TF_CMD=git branch);
  • run the command with eval (eval git branch).

So when fuck was called with arguments like fuck --help it was working like:

TF_CMD=$(
    TF_ALIAS=fuck
    PYTHONIOENCODING=utf-8
    TF_SHELL_ALIASES=$(alias)
    thefuck $(fc -ln -1 | tail -n 1)
) && eval $TF_CMD --help

So in the last step eval git branch --help was called.

As everyone know, bash and zsh don’t allow command line arguments in the middle of alias. So I just replaced aliases with functions.

And there’s another problem – how to distinguish previous command arguments with fuck arguments, and the solution is a dead simple – just use some placeholder, like THEFUCK_ARGUMENT_PLACEHOLDER, between the previous command and fuck arguments.

So now zsh “alias” (with history changing) looks like:

fuck () {
    TF_PREVIOUS=$(fc -ln -1 | tail -n 1);
    TF_CMD=$(
        TF_ALIAS=fuck
        TF_SHELL_ALIASES=$(alias)
        PYTHONIOENCODING=utf-8
        thefuck $TF_PREVIOUS THEFUCK_ARGUMENT_PLACEHOLDER $*
    ) && eval $TF_CMD;
    test -n "$TF_CMD" && print -s $TF_CMD
}

It looks almost the same, but more readable and support nice stuff like:

➜ git br
git: 'br' is not a git command. See 'git --help'.

Did you mean one of these?
	branch
	var
➜ fuck -y
git branch
  614-repeate-option
  620-command-line-arguments
...

➜ hit brunch
➜ fuck -r
git brunch [enter/↑/↓/ctrl+c]
git: 'brunch' is not a git command. See 'git --help'.

Did you mean this?
	branch
git branch [enter/↑/↓/ctrl+c]
  614-repeate-option
  620-command-line-arguments
...

As a drawback, this feature is shell-specific and currently implemented only for bash and zsh. So users of fish, PowerShell and tcsh at this moment can’t use command line arguments with fuck.

Maurice Herlihy: Write Great Code, Volume 1



book cover Recently, for studying, I was sometimes required to use low-level stuff like C, so I decided to read something about low-level things and chose to read Write Great Code, Volume 1 by Randall Hyde. The book is a bit too low-level, it even explains how processors of different architectures work inside, even on transistors level. Also, this book contains interesting information about how memory works on different levels and how processor uses different types of memory. Chapters about primitive data structures aren’t so interesting, but they give basic knowledge of how things work inside.

Although this book can look a bit outdated, especially in the chapter about IO, it’s interesting to read and somehow it’s even entertaining.

Simplify complex React components with generators



Sometimes writing complex React components, like huge dynamic forms, isn’t easy. By default for control flow in JSX we use conditional operator, so, for example, a complex form with some logic will look like:

class Form extends Component {
  propTypes: {
    importedAccount: boolean,
    corporateClient: boolean,
  };

  render() {
    return (
      <div>
        <form>
          {!this.props.importedAccount ? (
              <label>
                First Name
                <input type="text"/>
              </label>
            ) : null}
          {!this.props.importedAccount ? (
              <label>
                First Name
                <input type="text"/>
              </label>
            ) : null}
          {this.props.corporateClient ? (
              <label>
                Organization
                <input type="text"/>
              </label>
            ) : null}
          {this.props.corporateClient ? (
              <label>
                Role
                <input type="text"/>
              </label>
            ) : null}
          {!this.props.corporateClient ? (
              <label>
                Billing address
                <input type="text"/>
              </label>
            ) : null}
        </form>
      </div>
    );
  }
}

But that much of ternary operators looks ugly. And explicitly writing ... ? ... : null isn’t fun at all. So we can try a bit different approach, we can extract all these inputs to separate methods, like:

class Form extends Component {
  propTypes: {
    importedAccount: boolean,
    corporateClient: boolean,
  };

  renderFirstNameInput() {
    if (!this.props.importedAccount) {
      return (
        <label>
          First Name
          <input type="text"/>
        </label>
      );
    }
  }

  renderLastNameInput() {
    if (!this.props.importedAccount) {
      return (
        <label>
          First Name
          <input type="text"/>
        </label>
      );
    }
  }

  renderOrganizationInput() {
    if (this.props.corporateClient) {
      return (
        <label>
          Organization
          <input type="text"/>
        </label>
      );
    }
  }

  renderRoleInput() {
    if (this.props.corporateClient) {
      return (
        <label>
          Role
          <input type="text"/>
        </label>
      );
    }
  }

  renderBillingAddressInput() {
    if (!this.props.corporateClient) {
      return (
        <label>
          Billing address
          <input type="text"/>
        </label>
      );
    }
  }

  render() {
    return (
      <div>
        <form>
          {this.renderFirstNameInput()}
          {this.renderLastNameInput()}
          {this.renderOrganizationInput()}
          {this.renderRoleInput()}
          {this.renderBillingAddressInput()}
        </form>
      </div>
    );
  }
}

Looks much nicer, but it’s a bit bloated, there’s too much code. Another approach is to create some generator method and convert it to an array in render method, like:

class Form extends Component {
  propTypes: {
    importedAccount: boolean,
    corporateClient: boolean,
  };

  *renderForm() {
    if (!this.props.importedAccount) {
      yield (
        <label key="first-name">
          First Name
          <input type="text"/>
        </label>
      );
      yield (
        <label key="last-name">
          Last Name
          <input type="text"/>
        </label>
      );
    }

    if (this.props.corporateClient) {
      yield (
        <label key="organization">
          Organization
          <input type="text"/>
        </label>
      );
      yield (
        <label key="role">
          Role
          <input type="text"/>
        </label>
      );
    } else {
      yield (
        <label key="billing-address">
          Billing address
          <input type="text"/>
        </label>
      );
    }
  }

  render() {
    return (
      <div>
        <form>
          {[...this.renderForm()]}
        </form>
      </div>
    );
  }
}

So it looks much nicer, there’s far way less code than in methods approach, it looks less hackish than conditional operator approach. And we finally can have easily readable control flow.