Small footprint web server

Nowadays we're all concerned about the environment and our impact on it. At the same time, security concerns do not let our conscience settle for the cloud with all its bells and whistles.

Luckily, the consumer segment now offers low-profile, low-power platforms like the Raspberry Pi and others. Compare the 1-2W of a Raspberry Pi to the 10-20W (at least) of any other HTPC. Such platforms, however, come with certain resource restrictions.

That is - no one argues that Apache fits its own performance and feature-richness profile and does its job perfectly well. However, running such a monster (which it has become nowadays) for a home web server that expects no more than one visit a day, and on a Raspberry Pi at that, is overkill and a waste of every possible resource. Memory is a hard limit, and with it possibly even SD card wear due to swap usage.

On the Raspberry Pi, several lightweight web servers are available out of the box - thttpd, lighttpd, nginx, darkhttpd, etc. The server of choice for this task is nginx. Not that it's better than the others in some critical area as far as I know - I'm simply familiar with it, having used it as a load balancer. What is critical for my setup is static and dynamic content, the latter provided by PHP (tt-rss) and Perl (this site).

FastCGI is our friend

nginx and other light web daemons do not intend to become application servers by incorporating a runtime environment (e.g. mod_perl or mod_php). Instead they rely on external app servers and communicate with them using standard protocols - uwsgi or FastCGI. uwsgi drags in quite a fat dependency which I do not intend to use at all (Python), while FastCGI is quite light - in fact it requires no dependency at all, since it's just a protocol. To work with Perl it suffices to install CGI::Fast (for convenience) or even pure FCGI (the minimal protocol layer). PHP ships with a pre-built FastCGI flavour - php-fpm.

The concept is simple - use a CGI-like communication environment, but do not fork. That means your application/script runs permanently as a server itself, accepting connections from the HTTP server and generating content in response. The HTTP server then acts as a proxy for dynamic content and serves only static content itself. The win - no overhead of spawning the application on each request (which becomes a real problem for scripting languages). The drawback - you need to manage resources more carefully: garbage collection, closing descriptors, pooling connections - whatever it takes to reach maximum efficiency.
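To make this concrete, here is a minimal sketch of such a persistent worker in Perl using CGI::Fast (assuming a reasonably recent version of the module). The socket path and the recycle threshold are illustrative assumptions - nginx's fastcgi_pass would have to point at the same socket.

#!/usr/bin/perl
# Minimal persistent FastCGI worker sketch.
# Socket path is an assumption - must match nginx's fastcgi_pass.
use strict;
use warnings;
use CGI::Fast
 socket_path  => '/var/run/perl-fcgi.sock',
 listen_queue => 5;

my $served = 0;

# The accept loop: the process stays resident, handling request after
# request - no fork and no interpreter start-up per hit.
while (my $q = CGI::Fast->new) {
 print $q->header(-type => 'text/html; charset=utf-8');
 print '<html><body>request ', ++$served, "</body></html>\n";

 # Resource hygiene: recycle the worker after N requests so slow leaks
 # never accumulate (a supervisor is expected to respawn it).
 last if $served >= 1000;
}

Run it under whatever supervisor you prefer, so a recycled or crashed worker gets respawned automatically.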

Stop speaking, do something!

Our setup will be multilayered: one server will run the PHP application (tt-rss) and act as a caching reverse proxy for the rest (accelerator mode); the second layer will run the Perl application. Caching will be configured in aggressive/sticky mode - if the downstream dies (becomes unavailable), cached content will be served until the lower layer replies with something. That is useful when the upstream sits on a reliable power supply and/or internet connection while the downstream may disappear for some reason.

Important - when running nginx as a caching reverse proxy it is crucial to send correct cache control headers and timestamps. Otherwise you end up with a mess where part of the content is cached by nginx, which won't attempt to refresh it, while another part is generated with no caching at all.
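For illustration, a sketch of how the downstream Perl worker might emit consistent freshness headers. The source file path is hypothetical, and the 300-second max-age is an assumption chosen to match the expires 5m in the proxy config below, so both caches agree on the content lifetime.

use strict;
use warnings;
use POSIX qw(strftime);
use CGI::Fast;

while (my $q = CGI::Fast->new) {
 # Hypothetical source file whose mtime dates the generated page.
 my $mtime = (stat '/srv/site/page.src')[9] // time;

 # Explicit freshness info: max-age tells the nginx cache how long the
 # copy stays valid; Last-Modified lets it revalidate cheaply with
 # If-Modified-Since instead of refetching the whole body.
 print $q->header(
  -type          => 'text/html; charset=utf-8',
  -Cache_Control => 'public, max-age=300',
  -Last_Modified => strftime('%a, %d %b %Y %H:%M:%S GMT', gmtime $mtime),
 );
 print "<html><body>fresh for five minutes</body></html>\n";
}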

First comes the default stanza to catch everything and forward it downstream with aggressive caching. In the global context we define a non-volatile cache to store content for a long time. Then the standard server definition - it makes sense to define it for both http and https. And finally the catch-all location. Here we do the aggressive caching and the downstream reverse proxying. We also add an X-Forwarded-For header with the client's IP address - for use downstream.

# Long-lived on-disk cache; entries stay for up to 5 years without being hit.
proxy_cache_path /mnt/f2fs/nginx/cache inactive=5y keys_zone=cache_zone:1m;
server {
 listen *:80 default_server;
 listen [::]:80 default_server;
 listen *:443 default_server ssl;
 listen [::]:443 default_server ssl;
 ... ssl params ...
 # Catch-all: aggressive caching plus reverse proxying to the downstream.
 location / {
  expires 5m;
  # Note: variables in proxy_pass require a resolver directive (or an
  # upstream block) so the name can be resolved at run time.
  proxy_pass $scheme://down.stream.net:$server_port;
  proxy_redirect https://down.stream.net/ /;
  proxy_cache cache_zone;
  # Revalidate expired entries with conditional requests.
  proxy_cache_revalidate on;
  # Sticky mode: serve stale content while the downstream is unreachable.
  proxy_cache_use_stale error timeout;
  proxy_http_version 1.1;
  proxy_set_header Connection keep-alive;
  proxy_set_header X-Forwarded-For $remote_addr;
 }

Next comes the more specific location definition for tt-rss. We change its default URL (to avoid scanners), so an alias is needed to map the URL to the filesystem path. We also redirect all unencrypted requests to the same URL under the https (encrypted) scheme. Nested locations are used to set common parameters for the PHP application. PHP itself is connected via the FastCGI interface. The PHP location is defined using a regex with a capture group - to extract the script name from the URL path and assign it to the CGI parameter. The last two sections are for static content, plus a deny closure for everything else.

 location /zxcvb {
  # Force https: redirect any plain-http request to the encrypted scheme.
  if ($https != "on") {
   return 301 "https://$host$request_uri";
  }
  index index.php;
  # Map the obfuscated URL onto the real tt-rss installation path.
  alias /opt/tt-rss;
  # PHP scripts: the capture group extracts the script name from the URL
  # and feeds it to php-fpm over the FastCGI socket.
  location ~ ^/zxcvb/(.+\.php)(/?.*)$ {
   fastcgi_index index.php;
   fastcgi_param SCRIPT_FILENAME $document_root/$1;
   include fastcgi_params;
   fastcgi_pass unix:/var/run/php5-fpm.sock;
  }
  # Static assets get a short client-side cache (.php is handled above).
  location ~ ^/zxcvb/.+\.(css|js|gif|png|jpg|svg|swf|html|xml|xsl|json|ico)$ {
   expires 1h;
  }
  # Everything else under the application tree is off limits.
  location ~ ^/zxcvb/.+ {
   deny all;
  }
 }
}
Sun Nov 15 23:11:47 2015
 
 
© ruff 2011