{"id":1161,"date":"2020-03-29T20:02:07","date_gmt":"2020-03-29T18:02:07","guid":{"rendered":"https:\/\/craftcoders.app\/?p=1161"},"modified":"2024-08-14T14:27:51","modified_gmt":"2024-08-14T12:27:51","slug":"wordpress-vs-static-web-pages-the-best-of-both-worlds","status":"publish","type":"post","link":"https:\/\/craftcoders.app\/wordpress-vs-static-web-pages-the-best-of-both-worlds\/","title":{"rendered":"WordPress without the security hassles \u2013 using GitLab CI\/CD to automate transforming WordPress into a static website"},"content":{"rendered":"\r\n
Recently we launched our new company website (craftcoders.app<\/a>). It\u2019s a simple website that showcases us and our work and describes the kind of services that we provide to customers. It requires no dynamic features except for the contact form.<\/p>\r\n We decided to build our website with WordPress, but to automatically generate a static copy of it and serve that to visitors. We’re using Gitlab CI\/CD as automation tool. This guide will explain how you can setup your own pipeline to generate a static website from a WordPress site on a regular schedule or manually. But first we’ll have a detailed look at the pros and cons of WordPress and static websites in the next section. Feel free to skip over it, if this is not of interest to you.<\/p>\r\n At craft-coders we value efficiency, and we try to choose the right tool for the job. WordPress is the leading CMS for small websites. It\u2019s easy to set up and deploy. At the time of writing ~35% of all websites on the internet are built with it. Because of its popularity there are tons of useful plugins and great themes available. So that you can build good-looking and feature rich websites really quickly.<\/p>\r\n But WordPress has its downsides. Mainly it sucks at security. So famously, that ~1\/5 of the Wikipedia<\/a> article on it focuses on its vulnerabilities. The plugin market for WordPress does not provide any quality checks and if you look at the code base of most plugins (even some popular ones), any self-respecting programmer will scream out in agony.<\/p>\r\n Because of this we are very much against using WordPress for more than simple representational websites and blogs. Basically if your website is built on WordPress you must expect getting hacked. It’s therefore crucial that your WordPress installation is running on a server that isn\u2019t storing any sensitive information about you or your customers and that you use passwords that are used nowhere else. 
If people really need to log into your website, then at best you use an external authentication service, so that no information about passwords is stored on your server.<\/p>\r\n Still, even if there is nothing of value to gain for a potential attacker, so that a targeted attack against your website is very unlikely and getting hacked is more a nuisance than an actual problem, you still need to take basic precautions. Due to the popularity of WordPress there are a lot of bots out there that just scan the web for known vulnerabilities. They try to hack as many web pages as possible and use them to spread SPAM emails, viruses and propaganda, or use your server to mine Bitcoins.<\/p>\r\n The most important thing that you must do to protect yourself from bots is to keep your WordPress installation and its plugins updated at all times. This can be very annoying because updates may break things. And for most small websites the goal is often to deploy and forget. You don\u2019t want to spend time fostering your site, but just want it to continue to function as expected and be done with it. The ultimate goal of every person in operations is, to go unnoticed. If you have an admin that is constantly running around fixing stuff he\/she is probably not doing a good job, or he\/she has to compensate for the mistakes of the developers. You want things to work without the need of thinking about it.<\/p>\r\n Even though WordPress is the nightmare of every admin, in contrast to that, static web pages are the dream of every person working in operations. They\u2019re super easy to deploy, work super fast, can be kept in RAM and requests can be distributed between as many servers as you like. Because there is no code running on the server involved, they are basically unhackable. Provided of course that your webserver is secure, but since you can just rent a managed server this isn\u2019t really an issue that you need to concern yourself with. 
Yes, attacks running in the client\u2019s browser exploiting flaws in JavaScript or CSS are still feasible, but since a truly static website by definition has no valuable cookies or private information to steal, there is little to be gained by performing an attack in this manner (talking to an authenticated REST service can change that picture of course).<\/p>\r\n There are a few good static site generators out there, but as of now none of them provides an easy-to-use GUI and as many plugins\/themes as WordPress. If your goal is to build a simple website fast, WordPress should still be your first choice. Also if you decide to go with a static site generator there is no going back, your site will forever be static. Of course, you\u2019re always free to use JavaScript to talk to REST services and that is a good design choice, so this sounds more dramatic than it actually is.<\/p>\r\n To sum it up, WordPress is great for editors and site-builders but it sucks in operations. In contrast, static web pages are hard to use for editors and usually require more development effort than WordPress, but they are great in operations. This is a classic development vs. operations issue.<\/p>\r\n What if you could have both? Why not have a private, non-accessible installation of WordPress and generate a static copy from that? Then you can deploy that copy to a publicly accessible web space. That way you have the best of both worlds. Of course you deprive yourself of the dynamic features of WordPress, so no comment fields and no login sections, but if you don\u2019t need any of that, this is a perfect solution for you. And if your requirements ever change you can always replace your static copy with the real thing and go on with it.<\/p>\r\n This is the basic idea. The first thing we tried out was the WP2Static plugin which aims at solving this issue, but we couldn\u2019t get it running. We then decided to build our own solution using our favorite automation tool GitLab CI\/CD<\/a>. 
We used gitlab.com, and at the moment they are offering 2000 free CI minutes to every customer, which is a really sweet deal. But any CI tool should do. You should not have many issues porting this guide to Jenkins or any other tool that allows you to execute bash scripts. Also, we’re assuming you are using Apache (with mod_rewrite) as web server and that you can use .htaccess files. But porting this concept to other web servers shouldn\u2019t be too difficult.<\/p>\r\n You can find and fork the complete sample code here: https:\/\/gitlab.com\/sgellweiler\/demo-wordpress-static-copy.<\/a><\/p>\r\n Here is the plan in detail. We\u2019re going to use the same domain and web space to host both the private WordPress installation and the publicly accessible static copy. We\u2019re going to install WordPress to a sub directory that we will protect with basic auth using a .htaccess file. This is the directory that all your editors, admins and developers will access. The Gitlab job will crawl this installation using Wget and deploy the static copy via ssh+rsync into the directory \/static on the web space. Then we will use the .htaccess file in the root directory to rewrite all requests to the root path into the static directory. You can configure the Gitlab job to run every day, every hour or only manually depending on your needs.<\/p>\r\n To follow this guide you should have access to a *NIX shell and have the basic Apache tools (htpasswd), ssh tools (ssh-keygen, ssh-keyscan), find, sed and GNU Wget installed. Some distros ship with a minimal Wget installed, so make sure that you have the feature rich version of Wget installed (wget --version).<\/p>\r\n First install WordPress into a sub directory. For this guide I\u2019m going with wp_2789218<\/em>. You can go along with this name or choose your own. You should use a unique name though, a string that you will use nowhere else. Ideally, add a few randomly generated chars in there. 
We\u2019re not doing this for security but to make search+replace for urls easier in the next step. If you go with your own folder name remember to replace all occurrences of wp_2789218 <\/em>in this guide with your folder name. We\u2019ll also add a catchy alias \/wp<\/em>, for you and your coworkers to remember, so don’t worry too much about the cryptic name.<\/p>\r\n Next we create a directory to store our static copy. We\u2019ll just name that static\/<\/em> and for now we\u2019ll just add an index.html<\/em> with <h1>Hello World<\/h1><\/em> in there.<\/p>\r\n Let\u2019s configure Apache to password protect our WordPress installation and to redirect request to \/static<\/em>. First generate a .htpasswd<\/em> file with user+password at the root-level (or at another place) of your web space using:<\/p>\r\n Next create a .htaccess<\/em> on the root level with the following. You need to reference the .htpasswd<\/em> file with an absolute path<\/strong> in the AuthUserFile<\/em>:<\/p>\r\n And that\u2019s it for the server config part. If you go to your.domain.tld<\/em> then you should see the Hello World<\/em> from the index.html<\/em> in the static directory. If you go to your.domain.tld\/wp<\/em> you should get redirected to your WordPress installation and be forced to enter a password.<\/p>\r\n To make a static copy of your website you need a crawler that will start at your start page, follow all links to sub pages and download them as html including all CSS and JavaScript. We tried out several tools and the one that performed the best by far is the good old GNU Wget. It will reliably download all HTML, CSS, JS and IMG resources. But it will not execute JavaScript and therefore fail to detect links generated through JavaScript. In this case you might run into problems. 
However, most simple WordPress sites should be fine from the get go.<\/p>\r\n Let\u2019s have a look at the Wget command we will use to generate a static copy of our WordPress site:<\/p>\r\n Here is an explanation of all the options in use:<\/p>\r\n This will generate a static copy of your WordPress installation in wp_2789218<\/em>. You can test if the crawling worked by opening the index.html<\/em> in wp_2789218<\/em> with a browser.<\/p>\r\n Wget will try to rewrite URLs in HTML and CSS, but it will fail to do so in meta tags, inside of JavaScript and in other places. This is where the unique name of our directory comes into play. Because we named it wp_2789218<\/em> and not wordpress<\/em>, we can now safely search and replace through all files in the dump, and replace every occurrence of wp_2789218\/<\/em>, wp_2789218\\\/<\/em>, wp_2789218%2F<\/em> and wp_2789218<\/em> with an empty string (“”) so that the links will be correct again in all places. We will use find+sed for that.<\/p>\r\n Here is the macOS variant of that:<\/p>\r\n And here is the same for Linux with GNU sed:<\/p>\r\n To save you the headache: (\\\\\\\/|%2F|\\\/)?<\/em> will match \/<\/em>, \\\/<\/em>, %2F<\/em> and the empty string (“”).<\/p>\r\n Now that we have generated a static copy of our website, we want to deploy it to \/static<\/em> on the web space. You can do this over rsync+ssh, if you have ssh access to your server.<\/p>\r\n The command to do so looks like this:<\/p>\r\n Remember to adjust the user, domain and path to the directory in webspaceuser@yourdomain.tld:static<\/em> to your needs.<\/p>\r\n For our automated deployment with Gitlab, you should create a new private\/public ssh keypair using:<\/p>\r\n This will create deploy<\/em> and deploy.pub<\/em> files in your current directory. Copy the contents of deploy.pub<\/em> to ~\/.ssh\/authorized_keys<\/em> on your remote server to allow ssh-ing with it to your server. 
You can use this one-liner for that:<\/p>\r\n Next, test that you have set up everything correctly by ssh-ing with the new key to your web space:<\/p>\r\n For Gitlab you will need the signature of your remote ssh server. You can generate it with ssh-keyscan. Copy the output of that, because you will need it in the next step:<\/p>\r\n Now that we have established all the basics it\u2019s time to put it all together in one .gitlab-ci.yml file. But first we need to configure a few variables. On your Gitlab project go to Settings \u2192 CI\/CD \u2192 Variables and create the following variables:<\/p>\r\n Our Gitlab pipeline will have two phases for now: crawl and deploy. They are going to run the commands that we discussed in the previous sections in different Docker containers. This is the .gitlab-ci.yml<\/em>:<\/p>\r\n\r\n\r\n\r\n That’s pretty much it, now you have a pipeline that will generate a static copy of your WordPress site and upload that back to your web space. You could set up a schedule for your pipeline<\/a> to run automatically on a regular basis or you can use the Run Pipeline button<\/a> to start the process manually.<\/p>\r\n\r\n\r\n\r\n We would like to add one more step to our pipeline. It’s always good to do a little bit of testing, especially if you’re executing stuff without supervision. If the crawler fails for whatever reason to download your complete website, you probably want the pipeline to fail before going into the deploy phase and breaking your website for visitors. So let’s perform some basic sanity checks on the static copy before starting the deploy phase. The following checks are all very basic and it’s probably a good idea to add some more checks that are more specific to your installation. Just check for the existence of some sub pages, images, etc. and grep some strings. 
Also probably you want to make the existing rules a bit stricter.<\/p>\r\n\r\n\r\n\r\n Even the most basic web sites usually need a little bit of dynamic functionality, in our case we needed a contact form. We decided to go with Ninja Forms Contact Form<\/a>. Ninja forms work by sending requests to wp-admin\/admin-ajax-vhio8powlv.php<\/em>. This will obviously fail on our static website. To make it work, we will need to reroute requests to admin.ajax.php<\/em> to our WordPress backend.\u00a0 The admin-ajax-vhio8powlv.php<\/em> is used by all sorts of plugins, not only ninja forms and to increase security we want to only whitelist calls for Ninja Forms. Ninja form will make a POST request with application\/x-www-form-urlencoded<\/em> and the parameter action set to nf_ajax_submit<\/em>. Since there is no way (at least none that we know of) in Apache to filter for form parameters we will need to solve this in PHP. The idea is to create an alternative admin-ajax-vhio8powlv.php<\/em> to call instead, that in turn will call the wp-admin\/admin-ajax-vhio8powlv.php<\/em> in the WordPress backend, but only for Ninja Form requests. To further increase protection from bots, we will also rename the wp-admin\/admin-ajax-vhio8powlv.php<\/em> to admin-ajax-oAEhFc.php.<\/em> This won’t really help us against intelligent attackers, but it should stop most bots that try to use an exploit against wp-admin\/admin-ajax-vhio8powlv.php<\/em>. Then we will need to add the admin-ajax-oAEhFc.php<\/em> to the root of our web space. This file simply checks if this is indeed an Ninja Form call and then include the wp-admin\/admin-ajax-vhio8powlv.php<\/em> from the\u00a0 WordPress backend. 
After that we will fix any urls in the output that are still pointing to our WordPress site, so that they point to our static site instead.<\/p>\r\n\r\n\r\n\r\n Finally we will need to modify the .htaccess file to allow requests to admin-ajax-oAEhFc.php<\/em> and to not rewrite them to static\/<\/em>.<\/p>\r\n\r\n\r\n\r\n And that’s it. If you have done everything correctly after running your pipeline again, Ninja Forms should work.<\/p>\r\n\r\n\r\n\r\n A similar procedure should work for many other plugins too. Though keep in mind that with every plugin you allow access to your backend, you will also increase the attack surface.<\/p>\r\n\r\n\r\n\r\n\r\n\r\n You may want to have a custom 404 page instead of the standard 404 error page that Apache will serve by default. Assuming that you have already created a nice looking 404 page in your WordPress installation, in theory we could just use Wget to make a request to a url that does not exist and use the output of that. Unfortunately Wget does a terrible job dealing with non-200 status codes. There is a --content-on-error <\/em>option that will let it download the contents of a 404 page, but it will refuse to download any images, stylesheets or other resources attached to it.<\/p>\r\n To deal with that situation we will simply create a normal page in our WordPress backend and use that as a 404 page. So create your page in WordPress and remember the url you gave it.<\/p>\r\n We can now add that url to our list of files for Wget to download and then use the .htaccess<\/em> file to redirect all 404 requests to that file.<\/p>\r\n OK, so let’s add our 404 page to the wget command in the .gitlab-ci.yml file:<\/p>\r\n <\/p>\r\n \r\n\r\n<\/p>\r\n To redirect all 404 errors to notfound\/index.html<\/em> we will have to add one instruction to the .htaccess file:<\/p>\r\n If you have done everything correctly, then after you run your pipeline and visit any non-existing url you should get your custom error page. 
However, if you try to access a deeper level like yourdomain.tld\/bogus\/bogus\/bogus<\/em> the page will probably look badly broken.<\/p>\r\n This is because Wget will rewrite all links to be relative and we access our 404 page from different paths. To fix this we can add a <base><\/em> tag inside of the <head><\/em> with an absolute url. We insert the base tag with sed after running Wget in the .gitlab-ci.yml<\/em> like this:<\/p>\r\n \r\n\r\n<\/p>\r\n And that’s it, if you run your pipeline again the 404 page should look fine.<\/p>\r\n We have successfully created a Gitlab job that generates and publishes a static copy of a WordPress site and secured the actual WordPress backend against attacks by bots and humans. And because of the 2000 free minutes of CI that Gitlab is currently offering, it didn’t even cost us a dime. If you can live with the limitations of a static website, we definitely recommend this or a similar solution to you. It will push the risk of getting hacked near zero and you will no longer need to spend precious time ensuring that your site and all of its plugins are up to date. Also your site will be as fast as lightning. Best regards, Our Gitlab pipeline will have two phases for now: crawl and deploy. They are going to run the commands that we discussed in the previous sections in different docker containers. This is the .gitlabci.yml: That’s pretty much it, now you have a pipeline that will generate a static copy of your WordPress site and upload that back to your web … Read More<\/a><\/p>\n","protected":false},"author":2,"featured_media":2305,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[109,196],"tags":[],"acf":[],"yoast_head":"\nThe ups and downs of WordPress and static websites<\/h2>\r\n
Using WordPress to generate a static web page<\/h2>\r\n
Setting up the web space<\/h3>\r\n
htpasswd -c \/var\/www\/httpdocs\/.htpasswd yourusername<\/code><\/pre>\r\n
RewriteEngine On\r\nRewriteBase \/\r\n\r\n# Setup basic auth\r\nAuthUserFile \/var\/www\/httpdocs\/.htpasswd\r\nAuthType Basic\r\nAuthName \"Only for trusted employees\"\r\n\r\n# Require a password for the wp installation.\r\n<RequireAny>\r\n Require expr %{REQUEST_URI} !~ m#^\/wp_2789218#\r\n Require valid-user\r\n<\/RequireAny>\r\n\r\n# Add an easy to remember alias for the wp installation.\r\nRewriteRule ^wp\/(.*) wp_2789218\/$1 [R=302,L]\r\nRewriteRule ^wp$ wp_2789218\/ [R=302,L]\r\n\r\n# Rewrite all request to the static directory.\r\n# Except for requests to the wp installation.\r\nRewriteCond %{REQUEST_URI} !^\/static.*\r\nRewriteCond %{REQUEST_URI} !^\/wp_2789218.*\r\nRewriteRule ^(.*)$ static\/$1 [L]\r\n<\/code><\/pre>\r\n
Generating a static copy of a website<\/h3>\r\n
wget \\\r\n -e robots=off \\\r\n --recursive \\\r\n -l inf \\\r\n --page-requisites \\\r\n --convert-links \\\r\n --restrict-file-names=windows \\\r\n --trust-server-names \\\r\n --adjust-extension \\\r\n --no-host-directories \\\r\n --http-user=\"${HTTP_USER}\" \\\r\n --http-password=\"${HTTP_PASSWORD}\" \\\r\n \"https:\/\/yourdomain.tld\/wp_2789218\/\" \\\r\n \"https:\/\/yourdomain.tld\/wp_2789218\/robots.txt\"\r\n<\/code><\/pre>\r\n
\r\n
-e robots=off: Ignore instructions in the robots.txt<\/em>. This is fine since we\u2019re crawling our own website.<\/li>\r\n
--recursive: Follow links to sub directories.<\/li>\r\n
-l inf: Set the recursion depth to infinite.<\/li>\r\n
--page-requisites: Download everything a page needs to render, like CSS, JS and images.<\/li>\r\n
--convert-links: Change absolute links to relative links.<\/li>\r\n
--restrict-file-names=windows: Change filenames to be compatible with (old) Windows. This is a useful option even if you\u2019re not running on Windows, because otherwise you will get really ugly names that can cause issues with Apache.<\/li>\r\n
--trust-server-names: Use the filenames of redirects instead of the source url.<\/li>\r\n
--adjust-extension: Append the matching suffix (such as .html or .css) to downloaded files of that type whose url does not already end in it.<\/li>\r\n
--no-host-directories: Download files directly into wp_2789218<\/em> and not into yourdomain.tld<\/em>.<\/li>\r\n
--http-user: The username used for basic auth to access the wp installation. As defined in your .htpasswd.<\/li>\r\n
--http-password: The password used for basic auth to access the wp installation. As defined in your .htpasswd.<\/li>\r\n
The list of urls to download. We set this to the start page; Wget will recursively follow all links from there.
We also copy the robots.txt along.<\/li>\r\n<\/ul>\r\nLC_ALL=C find wp_2789218 -type f -exec sed -E -i '' 's\/wp_2789218(\\\\\\\/|%2F|\\\/)?\/\/g' {} \\;<\/code><\/pre>\r\n
find wp_2789218\/ -type f -exec sed -i -E 's\/wp_2789218(\\\\\\\/|%2F|\\\/)?\/\/g' {} \\;<\/code><\/pre>\r\n
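To see what the search and replace does without touching any files, the same expression can be run on a few sample strings. This is only a sketch: strip_wp_prefix and demo_strip_wp_prefix are hypothetical helper names, the sample lines are made up, and # is used as the sed delimiter here purely to cut down on backslashes.

```shell
#!/bin/sh
# Hypothetical helper: reads text on stdin and removes the unique directory
# name together with an optional trailing "/", "\/" or "%2F".
# Same expression as in the find+sed command, only with "#" as delimiter.
strip_wp_prefix() {
    sed -E 's#wp_2789218(\\/|%2F|/)?##g'
}

# Made-up sample lines covering the variants mentioned in the text:
# a plain HTML link, a JSON-escaped url and a url-encoded parameter.
demo_strip_wp_prefix() {
    printf '%s\n' \
        '<a href="/wp_2789218/about/">' \
        '{"url":"https:\/\/yourdomain.tld\/wp_2789218\/about"}' \
        'redirect=%2Fwp_2789218%2Fabout' |
        strip_wp_prefix
}
```

Calling demo_strip_wp_prefix prints the three lines with the directory name removed, which is a quick way to check a changed folder name before running the real command over the whole dump.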
Deploying the static copy to a web space<\/h3>\r\n
rsync -avh --delete --checksum wp_2789218\/ \"webspaceuser@yourdomain.tld:static\"<\/code><\/pre>\r\n
ssh-keygen -m PEM -N \"\" -C \"Deploymentkey for yourdomain.tld\" -f deploy<\/code><\/pre>\r\n
cat deploy.pub | ssh webspaceuser@yourdomain.tld -- 'mkdir -p ~\/.ssh && chmod 700 ~\/.ssh && cat - >> ~\/.ssh\/authorized_keys && chmod 600 ~\/.ssh\/authorized_keys'<\/code><\/pre>\r\n
ssh -i deploy webspaceuser@yourdomain.tld<\/code><\/pre>\r\n
ssh-keyscan yourdomain.tld<\/code><\/pre>\r\n
Putting it all together<\/h3>\r\n
\r\n
SSH_ID_RSA: This is the private key that will be used for rsync to upload the static dir. Put the contents of the deploy<\/strong><\/em> file that you created in the previous step in here.
This should be of type File<\/strong> and state Protected<\/strong>.<\/li>\r\n
SSH_ID_RSA_PUB: This is the public key that will be used for rsync to upload the static dir. Put the contents of the deploy.pub<\/strong><\/em> file that you created in the previous step in here.
This should be of type File<\/strong>.<\/li>\r\n
SSH_KNOWN_HOSTS: The known hosts file contains the signature of your remote host.
This is the output that you generated with ssh-keyscan<\/strong>.
This should be of type File<\/strong>.<\/li>\r\n
RSYNC_REMOTE: Example: webspaceuser@yourdomain.tld:static
The rsync remote to upload the static copy to. This is in the scheme of user@host:directory<\/em>.<\/li>\r\n
WORDPRESS_URL: The url to your WordPress installation. This is the starting point for wget.
This should be of type Variable<\/strong>.<\/li>\r\n
HTTP_USER: The user used by wget to access your WordPress installation using basic auth. This is the user that you put in your .htpasswd<\/em> file.
This should be of type Variable<\/strong>.<\/li>\r\n
HTTP_PASSWORD: The password for HTTP_USER used by wget to access your WordPress installation using basic auth.
This should be of type Variable<\/strong>, state Protected<\/strong> and Masked<\/strong>.<\/li>\r\n<\/ul>\r\n\r\n\r\n\r\n<\/a><\/figure>\r\n\r\n\r\n\r\n
stages:\r\n  - crawl\r\n  - deploy\r\n\r\nbefore_script:\r\n  - echo \"[INFO] setup credentials for ssh\"\r\n  - mkdir -p ~\/.ssh\r\n  - cp \"${SSH_ID_RSA}\" ~\/.ssh\/id_rsa\r\n  - cp \"${SSH_ID_RSA_PUB}\" ~\/.ssh\/id_rsa.pub\r\n  - cp \"${SSH_KNOWN_HOSTS}\" ~\/.ssh\/known_hosts\r\n  # The directory needs the execute bit (700), the key files only read\/write for the owner (600).\r\n  - chmod 700 ~\/.ssh\r\n  - chmod 600 ~\/.ssh\/id_rsa ~\/.ssh\/id_rsa.pub\r\n\r\ncrawl:\r\n  image:\r\n    name: cirrusci\/wget@sha256:3030b225419dc665e28fa2d9ad26f66d45c1cdcf270ffea7b8a80b36281e805a\r\n    entrypoint: [\"\"]\r\n  stage: crawl\r\n\r\n  script:\r\n    - rm -rf wp_2789218 static\r\n    - |\r\n      wget \\\r\n        -e robots=off \\\r\n        --recursive \\\r\n        --page-requisites \\\r\n        --convert-links \\\r\n        --restrict-file-names=windows \\\r\n        --http-user=\"${HTTP_USER}\" \\\r\n        --http-password=\"${HTTP_PASSWORD}\" \\\r\n        --no-host-directories \\\r\n        --trust-server-names \\\r\n        --adjust-extension \\\r\n        --content-on-error \\\r\n        \"${WORDPRESS_URL}\/\" \\\r\n        \"${WORDPRESS_URL}\/robots.txt\"\r\n\r\n    - find wp_2789218\/ -type f -exec sed -i -E 's\/wp_2789218(\\\\\\\/|%2F|\\\/)?\/\/g' {} \\;\r\n    - mv wp_2789218 static\r\n  artifacts:\r\n    paths:\r\n      - static\/*\r\n    expire_in: 1 month\r\n  only:\r\n    - master\r\n\r\ndeploy:\r\n  image:\r\n    name: eeacms\/rsync@sha256:de654d093f9dc62a7b15dcff6d19181ae37b4093d9bb6dd21545f6de6c905adb\r\n    entrypoint: [\"\"]\r\n  stage: deploy\r\n  script:\r\n    - rsync -avh --delete --checksum static\/ \"${RSYNC_REMOTE}\"\r\n  dependencies:\r\n    - crawl\r\n  only:\r\n    - master<\/code><\/pre>\r\n\r\n\r\n\r\n
stages:\r\n  - crawl\r\n  - verify_crawl\r\n  - deploy\r\n\r\n[...]\r\n\r\nverify_crawl:\r\n  image: alpine:3.11.3\r\n  stage: verify_crawl\r\n  script:\r\n    - echo \"[INFO] Check that dump is more than 1 MB in size\"\r\n    - test \"$(du -c -m static\/ | tail -1 | cut -f1)\" -gt 1\r\n\r\n    - echo \"[INFO] Check that dump is less than 500 MB in size\"\r\n    - test \"$(du -c -m static\/ | tail -1 | cut -f1)\" -lt 500\r\n\r\n    - echo \"[INFO] Check that there are more than 50 files\"\r\n    - test \"$(find static\/ | wc -l)\" -gt 50\r\n\r\n    - echo \"[INFO] Check that there is an index.html\"\r\n    - test -f static\/index.html\r\n\r\n    - echo \"[INFO] Look for 'wordpress' in index.html\"\r\n    - grep -q 'wordpress' static\/index.html\r\n  dependencies:\r\n    - crawl\r\n  only:\r\n    - master\r\n\r\n[...]<\/code><\/pre>\r\n\r\n\r\n\r\n
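The same kind of checks can be tried out locally against a dump before wiring them into the pipeline. The following is a minimal sketch under assumptions: verify_dump is a hypothetical helper name, the thresholds are parameters with defaults borrowed from the job above, the 500 MB upper bound is left out, and only regular files are counted.

```shell
#!/bin/sh
# Hypothetical helper mirroring the verify_crawl checks for a local dump.
# Usage: verify_dump DIR [MIN_FILES] [MIN_KB]
verify_dump() {
    dir="$1"
    min_files="${2:-50}"   # default borrowed from the pipeline job
    min_kb="${3:-1024}"    # roughly the 1 MB lower bound from the job

    # The start page must exist, otherwise the crawl clearly failed.
    [ -f "$dir/index.html" ] || { echo "no index.html in $dir" >&2; return 1; }

    # Count regular files only (the pipeline job also counts directories).
    files="$(find "$dir" -type f | wc -l | tr -d ' ')"
    [ "$files" -ge "$min_files" ] || { echo "only $files files" >&2; return 1; }

    # Total size of the dump in kilobytes.
    size_kb="$(du -sk "$dir" | cut -f1)"
    [ "$size_kb" -ge "$min_kb" ] || { echo "only ${size_kb} KB" >&2; return 1; }

    # Same string check as in the job.
    grep -q 'wordpress' "$dir/index.html" || { echo "'wordpress' not found" >&2; return 1; }
}
```

A call like verify_dump static returns a non-zero exit status and prints the failing check to stderr, so it can be used directly as a pipeline step or in a local shell.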
Adding a contact form<\/h2>\r\n\r\n\r\n\r\n
First we will need to modify the .gitlab-ci.yml<\/em>\u00a0file to add an extra find & sed after wget to the crawl step, to change all urls from “wp-admin\/admin-ajax-vhio8powlv.php” to “admin-ajax-oAEhFc.php”<\/em>:<\/p>\r\n\r\n\r\n\r\n[...]\r\n- find wp_2789218\/ -type f -exec sed -i -E 's\/wp-admin(\\\\\\\/|%2F|\\\/)admin-ajax-vhio8powlv.php\/admin-ajax-oAEhFc.php\/g' {} \\;\r\n[...]<\/code><\/pre>\r\n\r\n\r\n\r\n
<?php\r\n\/* Pass through some functions to the admin-ajax-vhio8powlv.php of the real wp backend. *\/\r\n\r\n\/\/ Capture output, so that we can fix urls later.\r\nob_start();\r\n\r\n\/\/ Pass through ninja forms only. isset() avoids a notice when no action is sent.\r\nif ($_SERVER['REQUEST_METHOD'] === 'POST' && isset($_POST['action']) && $_POST['action'] === 'nf_ajax_submit') {\r\n    require (__DIR__ . '\/wp_2789218\/wp-admin\/admin-ajax-vhio8powlv.php');\r\n}\r\n\r\n\/\/ Everything else should fail.\r\nelse {\r\n    echo '0';\r\n}\r\n\r\n\/\/ Fix urls in output.\r\n$contents = ob_get_contents();\r\nob_end_clean();\r\n\r\n\/\/ Map every variant of the backend path to its public counterpart.\r\n$search_replace = array(\r\n    'wp_2789218\/' => '',\r\n    'wp_2789218\\\\\/' => '',\r\n    'wp_2789218%2F' => '',\r\n    'wp_2789218' => '',\r\n    'wp-admin\/admin-ajax-vhio8powlv.php' => 'admin-ajax-oAEhFc.php',\r\n    'wp-admin\\\\\/admin-ajax-vhio8powlv.php' => 'admin-ajax-oAEhFc.php',\r\n    'wp-admin%2Fadmin-ajax-vhio8powlv.php' => 'admin-ajax-oAEhFc.php',\r\n);\r\n\r\necho str_replace(array_keys($search_replace), array_values($search_replace), $contents);\r\n<\/code><\/pre>\r\n\r\n\r\n\r\n
[...]\r\n# Rewrite all request to the static directory.\r\n# Except for requests to the wp installation.\r\nRewriteCond %{REQUEST_URI} !^\/static.*\r\nRewriteCond %{REQUEST_URI} !^\/admin-ajax-oAEhFc.php$\r\nRewriteCond %{REQUEST_URI} !^\/wp_2789218.*\r\nRewriteRule ^(.*)$ static\/$1 [L]\r\n<\/code><\/pre>\r\n\r\n\r\n\r\n
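To make the routing easier to reason about, the rewrite rules can be modelled as a small shell function that prints where a request path ends up. This is a simplified sketch and not how Apache evaluates the rules: route_request is a hypothetical name, basic auth and the 404 handling are ignored, and the patterns are shell globs that only approximate the regular expressions from the .htaccess file.

```shell
#!/bin/sh
# Hypothetical model of the .htaccess rules. Prints the resulting path,
# prefixed with "302 " where the alias rules issue a redirect.
route_request() {
    case "$1" in
        /wp)                    printf '302 /wp_2789218/\n' ;;                 # alias, bare form
        /wp/*)                  printf '302 /wp_2789218/%s\n' "${1#/wp/}" ;;   # alias with a sub path
        /static*)               printf '%s\n' "$1" ;;                          # already inside static/
        /admin-ajax-oAEhFc.php) printf '%s\n' "$1" ;;                          # proxy script in the root
        /wp_2789218*)           printf '%s\n' "$1" ;;                          # WordPress backend
        *)                      printf '/static%s\n' "$1" ;;                   # everything else: static copy
    esac
}
```

For instance, route_request /about/ prints /static/about/, while the Ninja Forms endpoint and the backend directory are passed through unchanged.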
Adding a custom 404 page<\/h2>\r\n
[...]\r\n    - |\r\n      wget \\\r\n        -e robots=off \\\r\n        --recursive \\\r\n        --page-requisites \\\r\n        --convert-links \\\r\n        --restrict-file-names=windows \\\r\n        --http-user=\"${HTTP_USER}\" \\\r\n        --http-password=\"${HTTP_PASSWORD}\" \\\r\n        --no-host-directories \\\r\n        --trust-server-names \\\r\n        --adjust-extension \\\r\n        --content-on-error \\\r\n        \"${WORDPRESS_URL}\/\" \\\r\n        \"${WORDPRESS_URL}\/robots.txt\" \\\r\n        \"${WORDPRESS_URL}\/notfound\"\r\n[...]<\/code><\/pre>\r\n
ErrorDocument 404 \/static\/notfound\/index.html<\/code><\/p>\r\n
[...]\r\n - sed -i 's|<head>|<head><base href=\"\/notfound\/\">|' wp_2789218\/notfound\/index.html\r\n[...]<\/code><\/pre>\r\n
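The substitution can be tested on a sample line before it goes into the pipeline. A minimal sketch, assuming the theme emits a plain head tag without attributes: add_base_tag is a hypothetical helper name, it works on stdin instead of editing the file in place, and the base path must not contain the characters | or &.

```shell
#!/bin/sh
# Hypothetical helper: inserts a <base> tag right after the first plain
# <head> tag on each line of stdin. The base path defaults to /notfound/.
add_base_tag() {
    base_href="${1:-/notfound/}"
    sed "s|<head>|<head><base href=\"${base_href}\">|"
}
```

A head tag that carries attributes is left untouched by this pattern, so it is worth checking the generated notfound page once to confirm that the base tag actually landed in it.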
Conclusion<\/h2>\r\n\r\n\r\n\r\n
Go ahead and fork: https:\/\/gitlab.com\/sgellweiler\/demo-wordpress-static-copy<\/a>. And let us know how it works for you in the comment section.<\/p>\r\n\r\n\r\n\r\n
Sebastian Gellweiler<\/p>","protected":false},"excerpt":{"rendered":"