Force SSL for your site with Varnish and Nginx


For those of you who depend on Varnish to offer robust caching and scaling potential to your web stack, hearing about Google’s prioritization (albeit arguably small, for now) of sites that force SSL may give you pause about how to implement it.

Varnish currently doesn’t have the ability to handle SSL certificates and encrypt requests. It may never have this ability because its focus is to cache content, and it does a very good job of that, I might add.

So if Varnish can’t handle the SSL traffic directly, how would you go about implementing this with Nginx?

Well, nginx has the ability to proxy traffic. This is one of the many reasons why some admins choose to pair Varnish with Nginx. Nginx can do reverse proxying and header manipulation out of the box without custom modules or configuration. Combine that with the lightweight production tested scalability of Nginx over Apache and the reasons are simple. We’re not interested in that debate here, just a simple implementation.

Nginx Configuration

With Nginx, you will need to add an SSL listener to handle the ssl traffic. You then assign your certificate. The actual traffic is then proxied to the (already set up) non-https listener (varnish).

The one thing to note before going further is the second last line of the configuration. That is important because it allows you to avoid an infinite redirect loop of a request proxying to varnish, varnish redirecting non-ssl to ssl and back to nginx for a proxy. You’ll notice that pretty quickly because your site will ultimately go down :(

What nginx is doing is defining a custom HTTP header (“X-Forwarded-Proto”) and assigning a value of “https” to it :

So the rest of the nginx configuration can remain the same (the configuration that varnish ultimately forwards requests in order to cache).


What you’re going to need in your varnish configuration is a minor adjustment :

What the above snippet is doing is simply checking if the header “X-Forwarded-Proto” (that nginx just set) exists and if its value equals (case insensitive) “https”. If the header is not present or does not match, it sets a redirect to force the SSL connection, which is handled by the nginx ssl proxy configuration above. It’s also important to note that we are not just doing a clean break redirect; we are still appending the originating request URI in order to make it a smooth transition and potentially not break any previously ranked links/urls.
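The decision described above can be sketched in Python to make the logic explicit. This is only an illustration of the check, not the VCL itself; the function name and arguments are hypothetical.

```python
from typing import Optional


def https_redirect_target(headers: dict, host: str, url: str) -> Optional[str]:
    """Return the HTTPS URL to redirect to, or None if the request already arrived over SSL."""
    # HTTP header names are case-insensitive, so normalise the keys before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    proto = normalised.get("x-forwarded-proto", "")
    # Case-insensitive comparison of the value, mirroring the check described above.
    if proto.lower() == "https":
        return None
    # Keep the original request URI so existing links keep working after the redirect.
    return "https://" + host + url
```

A request that nginx proxied from its SSL listener carries the header and passes through; anything else is sent to the same URI over HTTPS.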

The last thing to note is the error 750 handler that handles the redirect in varnish :

You can see that we’re using a 302 temporary redirect instead of a permanent 301 redirect. This is your decision, though browsers tend to be stubborn in their own internal caching of 301 redirects, so 302 is good for testing.

After restarting varnish and nginx you should be able to quickly confirm that no non-SSL traffic is allowed anymore. You can not only enjoy the (marginal) SEO “bump” but you are also contributing to the HTTPS Everywhere movement which is an important cause!

Auto updating Atomicorp Mod Security Rules


If any of you use mod_security as a web application firewall, you might have enlisted the services of Atomicorp for regularly updating your mod_security ruleset with signatures to protect against constantly changing threats to web applications in general.

One of the initial challenges, in a managed hosting environment, was to implement a system that utilizes the Atomicorp mod_security rules and update them regularly on an automated schedule.

When you subscribe to their service, they provide access credentials in order to pull the rules. You then need to integrate the rule files into your mod_security implementation and gracefully restart apache or nginx to ensure all the updated rules are loaded.

We developed a very simple python script, intended to run as a cron scheduled task, in order to accomplish this. We thought we would share it here in case anyone else may find it useful at all to accomplish the same thing. This script could easily be modified to download rules from any similar service, alternatively. This script was written for nginx, but can be changed to be integrated with apache.
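As a rough sketch of one piece of such a cron job, the snippet below installs a freshly downloaded rule file only when its contents differ from the installed copy, so nginx is reloaded only when something changed. The function names and paths are hypothetical, and the download step (which depends on your Atomicorp credentials) is left out.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def install_if_changed(downloaded: Path, installed: Path) -> bool:
    """Copy the downloaded rule file into place; return True only if it changed."""
    if installed.exists() and sha256_of(downloaded) == sha256_of(installed):
        return False
    installed.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(downloaded, installed)
    return True


# Commands a cron job could run (for example via subprocess.run) after an install
# that returned True: test the configuration first, then reload gracefully.
RELOAD_COMMANDS = [["nginx", "-t"], ["nginx", "-s", "reload"]]
```

Running the configuration test before the reload means a broken rule file does not take the web server down.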

Find the code below. Enjoy!

Add Captcha to Sugar CRM Web to Lead forms


Capturing leads via web based forms is something that is pretty standard in many industries that rely on internet marketing for sales.

One of the many leading CRM (Customer relationship management) systems, which also happens to have an open source “community” edition is Sugar CRM.

Out of the box, Sugar CRM community edition does not offer the ability for anti-spam measures such as captcha. By default, implementing a web to lead form that integrates Sugar onto your public facing website appears to become a magnet for spam form submissions. Spammers can scrape indexed google results for specific fingerprints that are indicative of “spammable” web forms. This can happen quickly after implementing a form, as your site gets re-indexed by google.

Sometimes it can be very bad, which motivated us to implement reCaptcha (Google’s Captcha library) with the web to lead Sugar CRM forms.

It was much easier than we thought. Here’s how to do it with your Sugar CRM web to lead form :

Implement Recaptcha right near your submit button on the form

Add the following code (or the code in reCaptcha’s latest instructions) :

It’s important to note that you’re not fundamentally altering how the Sugar CRM web to lead form works. You’re just including the recaptcha library and displaying the captcha input box, with the captcha image of course.

The form, at this point, will still submit and be processed by Sugar regardless of what you enter in the captcha box. The next step is to include the recaptcha “check” in the actual Sugar Lead processing function.

Basically the recaptcha check, out of the box, does a simple check of the captcha input and “dies” if the input is incorrect. If it’s correct, you can put whatever php code you like in the “else” statement, which in Sugar’s case would be the actual form processing.

Process the captcha and submit the lead form

For Sugar CRM 6.5.x, the file you want to edit is modules/Campaigns/WebToLeadCapture.php. This file is supposed to have a check built in that allows you to override it with a leadCapture_override.php file in the root folder. This allows the changes you make to be “upgrade safe”, meaning that if you upgrade Sugar, the changes won’t get overwritten.

Here is the recaptcha “check” that verifies captcha input :

Notice the “else” statement at the bottom. That’s where you want the Sugar code that processes the lead form to execute. You don’t want Sugar to do ANYTHING if the captcha was not verified.

Edit the WebToLeadCapture.php file and add the above code around line 58, or above the following code that starts checking the html form’s POST values :

Simply put the else statement right above the above code, and ensure the opening and closing brackets for the recaptcha else statement encompass all the subsequent code, right to the bottom of the file, ensuring the closing bracket is below the following line :

The entirety of the changed WebToLeadCapture.php file is here :

Hopefully this will help reduce your spam entries with your Sugar CRM lead forms!

Web based system to purge multiple Varnish cache servers


We have been working with varnish for quite a while. And there is quite a lot of documentation out there already for the different methods for purging cache remotely via Curl, the varnish admin tool sets and other related methods.

We deal with varnish in the Amazon Cloud as well as on dedicated servers. In many cases varnish sits in a pool of servers in the web stack before the web services such as Nginx and Apache. Sometimes purging specific cache urls can be cumbersome when you’re dealing with multiple cache servers.

Depending on the CMS you are using, there are some modules / plugins available that offer the ability to purge Varnish caches straight from the CMS, such as the Drupal Purge module.

We have decided to put out a secure, web accessible method for purging Varnish cached objects across multiple varnish servers. As always, take the word “secure” with a grain of salt. The recommended way to publish a web accessible method on apache or nginx that gives the end-user the ability to request cache pages be purged would be to take these fundamentals into consideration :

– Make the web accessible page available only to specific source IPs or subnets
– Make the web accessible page password protected with strong passwords and non-standard usernames
– Make the web accessible page fully available via SSL encryption

On the varnish configuration side of things, with security still in mind, you would have to set up the following items in your config :


Set up an access control list in varnish that only allows specific source IPs to send the PURGE request. Here is an example of one :

vcl_recv / vcl_hit / vcl_miss / vcl_pass

This is self explanatory (I hope). Obviously you would be integrating the following logic into your existing varnish configuration.

The code itself is available on our GitHub Project page. Feel free to contribute and add any additional functionality.

It is important to note that what differentiates our solution from the existing ones out there is that our script will manipulate the host headers of the Curl request in order to submit the same hostname / url request across the array of varnish servers. That way the identical request can be received by multiple varnish servers with no local host file editing or anything like that.
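The host header technique can be sketched in Python with the standard library instead of Curl. This is only an illustration under assumptions: the server addresses, port and function names are hypothetical, and each varnish server must accept PURGE from the machine sending the requests.

```python
import urllib.request


def build_purge_requests(servers, hostname, path):
    """Build one PURGE request per varnish server, all carrying the same Host header."""
    requests = []
    for server in servers:
        req = urllib.request.Request(
            "http://" + server + path,   # connect to the individual cache server
            method="PURGE",
            headers={"Host": hostname},  # but ask for the public site's hostname
        )
        requests.append(req)
    return requests


def send_purges(requests, timeout=5):
    """Send each request and record the HTTP status, or the error text on failure."""
    results = {}
    for req in requests:
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results[req.full_url] = resp.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[req.full_url] = str(exc)
    return results
```

Because the Host header is set explicitly, every server sees the request as if it were for the public hostname, so no hosts file changes are needed on the machine doing the purging.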

There is lots of room for input sanity checks, better input logic and other options to perhaps integrate with varnish more intuitively. Remember this is a starting point, but hopefully it is useful for you!

Web based system to push your GIT code


Since posting recently about our Web based SVN push system , we have decided to take what we did there one step further and implement a very similar system for GIT, but with more options!

The web based GIT push system is, as mentioned, very similar to the web based SVN push system, with the exception that you can select branches before exporting the code.

I should stress before continuing that this system is not intended to be publicly visible on a website. Strict access controls need to be implemented in front of this implementation to protect the integrity and protect from malicious users. For example, only making this system available on a Development LAN, or putting it behind an IP restricted firewall, with IP restricted apache/nginx rules, web authentication and SSL will allow for a much more secure implementation of this system. My advice is to always assume everything is vulnerable at any time. Working backwards with that assumption has always been a good policy for me.

First of all the entire solution is available on GitHub for you to preview.

I’ll go through each file individually, briefly explaining what each file does.

The index.php file is straightforward. There is a small amount of php code embedded in this file with HTML to present the push page in a simple HTML table. An array is built for all the sites you want to push (in this example case it’s a Dev and Prod site). The array makes it very easy to add additional sites. Each array holds a source, destination, site name and site url.

The only field that is really used is the “pushname” variable in each site array. That variable gets passed to the shell script that actually takes care of the pushing mechanism.

The remaining php code in this file builds a list of sites based on the array, as well as pulling the current branch by running a function included in that pulls all the branches associated with a repository and saves it to a text file for easy parsing. The other function pulls the last time the site was pushed or “exported”, giving an easy reference when dealing with multiple developers.

It should be noted that it is best to implement apache/nginx web based access on a per-user basis in order to access this page. This is because the index.php file parses the username of who is accessing the site for logging purposes. So every user that needs to access this needs an htpasswd user/password created for them for security and accountability purposes.

The functions file is where many of the functions lie (obviously). There is a cross-site scripting (XSS) function that is used to filter any submitted input. I realize this is not very secure, but with the security considerations I mentioned in the beginning of this post, it should suffice. A good systems administrator would implement many hardware, software and intrusion detection layers, such as snort and mod_security, to prevent malicious users from injecting content. Nothing beats the security of a web accessible page that is completely offline on an internal LAN, obviously.

Next we have some functions that grab the branches, get the current branch that the site has been previously pushed on, some log file functions for storing the log file info and writing the log data and displaying it as well. All of these functions are intended to help keep the development process very organized and easy to maintain.

The gitupdate_process.php file is where the index.php file POSTs the data of the site you want to push. This file receives the data as a $_POST (with the XSS cleaner function mentioned earlier sanitizing as best as it can) and then passes that variable to the push bash shell script in order to do the actual file synchronization.

It might be possible to do all the file synchronization in php, but I felt that separating the actual git pulling and rsync process into a separate shell script made the process less obfuscated and confusing. The shell script rarely needs to change unless a new site is added obviously.

The log.php file is simply loaded as an iframe within index.php when someone clicks to view the export log. It parses the log.txt file and displays it. The export log format can be customized obviously, but usually would contain the site name, username who pushed, date and time as well as the branch pushed.

The log.txt file is self explanatory and contains the log information displayed by log.php.

The last file is the push bash shell script that gitupdate_process.php calls. Again this can be consolidated to be 100% PHP but I felt segmenting it was a good idea. You can see that the command line arguments are parsed from a $_POST in gitupdate_process.php and then passed to the shell script as arguments. This is very simple and shouldn’t be too hard to understand. The arguments are the site name ($1) and the git branch name that was selected from the dropdown box before hitting the export button ($2).
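Since form input ends up as shell script arguments, it is worth validating both values before the script is called. The sketch below shows the idea in Python rather than PHP; the script path, the site names and the branch pattern are assumptions, not part of the published package.

```python
import re
import subprocess

# Conservative whitelist for branch names: letters, digits, dot, underscore, slash, dash.
BRANCH_PATTERN = re.compile(r"[A-Za-z0-9._/-]+")


def build_push_command(script, site, branch, allowed_sites):
    """Return the argument list for the push script, or raise ValueError on bad input."""
    if site not in allowed_sites:
        raise ValueError("unknown site: " + repr(site))
    # Reject anything outside the whitelist, and names that would look like an option.
    if not BRANCH_PATTERN.fullmatch(branch) or branch.startswith("-"):
        raise ValueError("invalid branch name: " + repr(branch))
    return [script, site, branch]  # becomes $1 and $2 inside the shell script


def run_push(command):
    """Run the push script; passing a list means no shell ever parses the arguments."""
    return subprocess.run(command, capture_output=True, text=True, check=False)
```

Checking the site name against the known list and passing the arguments as a list removes the most obvious command injection paths, though it does not replace the access controls described earlier.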

That’s it! This package for GIT has made many developers’ lives easier and caused fewer headaches when diagnosing problems or even rolling back to a stable branch. Keeping a stable and organized development environment is key here, with the security considerations I mentioned earlier being paramount above everything else.

I hope that this script was helpful and would welcome any suggestions to improve it further :)

Looking to integrate TREB listings into WordPress? Check out this post by Shift8

Hey there,

I thought I’d link to a blog post by our web design and development site regarding a new Python based tool that allows you to grab TREB listings, listing images and all the TREB data and directly import the extracted data into a WordPress site.

The entire Python code was released with the GPL license and is available for anyone to modify and use. Of course we will be offering the integration and installation of this system as a service for those who are not as technically inclined to implement this system themselves. It’s definitely something that will make a difference in the third-party TREB data handling industry, as many commercial offerings along these lines are mainly based on the SaaS (software as a service) business model.

Offering a reliable and stable system of extracting TREB listings and importing them directly into wordpress, with an installation service as a one-time fee is quite exciting and should hopefully give many agents, web designers and developers out there a solid alternative option to paying for a service via a subscription model.

In any case, check out the post here : Shift8 TREB WordPress Integration with Python

You can also check out the GitHub page for the open source project here : GitHub Shift8 TREB WordPress Project

Centralized Backup Script

Hello There!

I thought I’d share a backup script that was written to consolidate backups onto one server instead of spreading the backup process across several servers. The advantages of consolidating the script onto one server are somewhat obvious, namely that making changes is much easier as you only have one script to edit.

The environment where this may be ideal would be for environments with 15-20 servers or less. I’d recommend a complete end-to-end backup solution for servers that exceed that number such as Bacula perhaps.

The bash shell script that I pasted below is very straightforward and takes two or more arguments. The first is the hostname or ip address of the destination server you are backing up. The remaining (potentially unlimited) arguments are single quote encased folders which you want backed up.

This script is dependent on the server the script is running on having ssh key based authentication enabled and implemented for the root user. Security considerations can be made with IP based restrictions either in the ssh configuration, firewall configuration or other considerations.

It should be explained further that this script actually connects to the destination server as the root user, using ssh key authentication. It then initiates a remote rsync command on the destination server back to the backup server as a user called “backupuser”. So that means that not only does the ssh key need to be installed for root on the destination servers, but a user called “backupuser” needs to be added on the backup server itself, with the ssh keys of all the destination servers installed for the remote rsync.

Hopefully I did not over complicate this, because it really is quite simple :

Backup Server -> root -> destination server to backup -> backupuser rsync -> Backup Server
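The flow above can be sketched as a single ssh command that starts rsync on the server being backed up. This Python sketch only builds the command; the /backups root, the rsync flags and the function name are assumptions, and the per-server directory is assumed to already exist on the backup server.

```python
import shlex


def build_backup_command(target, folders, backup_host, backup_root="/backups"):
    """Build the ssh command that makes `target` rsync its folders back to the backup server."""
    if not folders:
        raise ValueError("at least one folder is required")
    # The remote rsync pushes to the backup server as the unprivileged "backupuser".
    destination = "backupuser@{}:{}/{}/".format(backup_host, backup_root, target)
    remote_rsync = ["rsync", "-az", "--relative"] + list(folders) + [destination]
    # Quote every part, because ssh hands the remote command to a shell as one string.
    remote_command = " ".join(shlex.quote(part) for part in remote_rsync)
    return ["ssh", "root@" + target, remote_command]
```

The --relative flag keeps the full source path under the destination, so /etc and /var/www from the same server do not collide.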

Once you implement the script and do a few dry run tests then it might be ready to implement a scheduled task for each destination server. Here is an example of one cron entry for a server to be backed up :

A Web based system to push your SVN code through development, staging and production environments

Note the files in this post are now on GitHub

Hello there!

In development, having a seamlessly integrated process where you can propagate your code through whatever QA, testing and development policy you have is invaluable and a definite time saver.

We work with SVN as well as GIT code repository systems and have developed a web based system to “Export” or “Push” the code through development, staging and production environments as such.

I have already talked about sanitizing your code during the commit process, to ensure commit messages are standard and there are no PHP fatal errors, so now I will be showing you a simple web based system for propagating your code through development, staging and production servers.

This system should be on a secure web accessible page on each server. For the sake of argument , I’ll call each server the following : — development server — staging server — production server

We will be using PHP for the web based interface, and we will assume that you will be password protecting access to this page via htpasswd, as well as forcing SSL. I am also assuming that within your SVN repository, you have multiple “sites” that you will be individually pushing or exporting (svn export). Once you have the secure, password protected page (lets call it , the following PHP page will be the main index :


If you carefully look at the above code, you will see that this page will be dependent on 3 external scripts, which I will describe below. The page itself generates a list of whatever sites you want to include in the push process, within a PHP based Array. The array details important info per site such as the name, svn location, location of the files on the server as well as whatever other notes and additional info you want to provide.

Each time a site is “exported” by clicking the export button, it calls an external script called svnupdate_process.php. This executes the SVN EXPORT command, as well as logging which user requested the action into a simple text based log file. The user is determined by the authentication user that is accessing the page. The htpassword credentials you will be providing to your users should be set per-user so that it can be easier to determine who pushed the code and whatnot.

The other two external scripts are one that will view the log file in an iframe on the same page, as well as a script to extrapolate the pending commits that are in the queue since the LAST code push / svn export. That is really useful, as you can imagine.

Script to view the export log

This script, log.php is used to dump the contents of the log.txt export log file. Very simple


Simple, right? The log.php code includes a file, used for writing and reading the log.txt file. The above code depends on it, as well as the svnupdate_process.php code (described below), in order to log each time someone hits the export code button

The svn export process is handled by the following script. Again, it’s self explanatory. PHP executes a shell command to export the svn code based on the array variables defined in the very first script. Make sure all the variables match up to what’s in svn and the location of the files, and even execute a test run of the command manually with the variables. If there are problems, you can modify the command to pipe the output to a log file for further analysis. Additionally, you may need to alter the permissions of the apache user so that the command can be properly run. Avoid setting the apache user to have a shell (big no-no), but maybe a nologin shell or something along those lines. It’s completely up to you, but be very careful about the choices you make to get it to run properly.


$logtext = "Exported to {$_POST['site']}";


Finally the last script will be the script that parses the SVN log output with a date/time range from the last time the export button was pushed, until the current date and time. This will load the output in the same iframe log window on the svn page so the user can see what pending commits are in the code since the last time it was exported. Invaluable information, right?

Note that this has a function to filter out additional illegal characters to avoid cross site scripting injections. This code should be completely 100% restricted from outside public use, however it might be worth it to put this function in the svnupdate_process.php script as well. Can’t be too careful. I thought I’d include it here for you to use.


else {
    echo "No queries passed!";
}


Let’s break down the SVN log command, so you know what’s going on. I’m grabbing the SVN site array variables when the “view log” link is clicked on the svn page. I am also parsing the export log text file to get the last entry for the particular site in question, grabbing the date and time.

I am then getting the current date and time to complete the date/time range in the svn log query. The finished query should look something like this :
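To illustrate how such a date range query is put together, here is a small Python sketch that builds the svn log command from the two timestamps. The repository URL and function name are hypothetical; the curly brace date syntax is Subversion’s own revision date format.

```python
from datetime import datetime


def build_svn_log_command(repo_url, last_export, now):
    """Build an `svn log` command covering commits between the last export and now."""
    fmt = "%Y-%m-%dT%H:%M"
    # Subversion accepts dates in curly braces wherever a revision number is expected.
    revision_range = "{%s}:{%s}" % (last_export.strftime(fmt), now.strftime(fmt))
    return ["svn", "log", "-r", revision_range, repo_url]
```

For example, a last export at 09:30 on 2013-05-01 and a current time of 14:05 on 2013-05-02 give the range {2013-05-01T09:30}:{2013-05-02T14:05}.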


How to detect and mitigate DoS (Denial of Service) Attacks


Occasionally, even with a very busy site behind a hefty web stack, there is not enough capacity to absorb a significant surge in artificial (DoS) requests. Detecting a denial of service attack is an important and time sensitive exercise that will determine the next steps you may need to take to reduce or null route the offending traffic.

These steps are very basic and use the everyday system utilities and tools that are found in vanilla linux implementations. The idea is to utilize these tools to identify connection and request patterns.

I’m going to assume that your potential or assumed attack is going straight to port 80 (HTTP), which would be a common assumption. A typical DoS attack would just be a generation of thousands of requests to a particular page, or perhaps just to the homepage.

Check your Process and Connection Counts

It is a good idea to get a picture of how overworked your system is currently, other than the initial reports of slow site performance or perhaps mysql errors such as “The MySQL server has gone away”, or anything of the sort.

Count how many apache/httpd processes you currently have to get an idea :

Check what the CPU load is currently on the server :

So you can see the load is quite high and there are 96 apache processes that have spawned. Looks to be quite a busy server! You should take it a step further and perhaps identify how many open port 80 (HTTP) connections you have :

So that’s quite a significant number of HTTP connections on this server. It could be a substantial DoS attack, especially when you consider that this may be one server in a 3 server load balanced pool. That means the numbers here are likely multiplied by three.

Still, it could be legitimate traffic! The spike could be attributed to a link on reddit, or perhaps the site was mentioned on a popular news site. We need to look further at what requests are coming in to be able to determine if perhaps the traffic is organic or artificial. Artificial traffic would typically have thousands and thousands of identical requests, possibly coming from a series of IP addresses.

How distributed a DoS attack is probably depends on the skill and resources of the offending party. If it’s a DoS, hopefully it will be easily identifiable.

Let’s take a closer look at the open connections. Maybe we can see how many connections from individual IP addresses are currently open on the server. That may help identify if the majority of the traffic is from a single source or a small set of sources. This information can be kept aside after our analysis is complete so that we can use it to report and block the traffic to ultimately mitigate the attack.

What the above line essentially does is scan the open connections specifically to port 80 and filter only the IP addresses that have 45 or more open connections. This number can obviously be adjusted to whatever you like. Take a look at the different results and see what it produces.
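The same counting can be sketched in Python for anyone who prefers a script over a shell one-liner. It assumes the text comes from something like `netstat -ntu` (local address in the fourth column, remote address in the fifth); the function name and the threshold default are illustrative.

```python
from collections import Counter


def connections_per_ip(netstat_output, port=80, threshold=45):
    """Count connections to `port` per remote IP and keep those at or above `threshold`."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP connection line.
        if len(fields) < 5 or not fields[0].startswith("tcp"):
            continue
        local, remote = fields[3], fields[4]
        # Only count connections to the chosen port, and ignore listening sockets.
        if not local.endswith(":" + str(port)) or remote.endswith(":*"):
            continue
        remote_ip = remote.rsplit(":", 1)[0]
        counts[remote_ip] += 1
    # most_common() orders by count, highest first.
    return [(ip, n) for ip, n in counts.most_common() if n >= threshold]
```

Lowering the threshold is a quick way to see the whole distribution before deciding which addresses look abnormal.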

For potentially offending IP addresses, whois them and see where they are originating from. Are they from a Country that typically isn’t interested in your site? If the site is an English language site about local school programs in a small North American city, chances are someone from China or Russia has little legitimate interest in the site.

Analyze the Requests

In this scenario, we would be analyzing the apache access logs in order to get a clearer picture of what exactly is happening that is generating the DoS. Access logs in apache are a great resource to get the source IP, request URI and other useful information that may help identify an automated tool such as LOIC or an automated botnet perhaps that is sending thousands of identical requests to your server.

Let’s filter out the actual GET requests from the apache logs, sort them and count each request in order to show the highest number of requests for the same URI. If we can take this information and then cross reference it with the connection stats we gathered earlier, we should have a clear picture of who is conducting the attack and how they are doing it.

This code filters GET requests from the logs, cleans them up, sorts the results, counts all the duplicate requests , sorts it by highest number of requests, and prints the results.

Again, the 45 in the last portion of the command can be changed to whatever you feel is necessary. It all depends on what a normal request looks like, what is legitimate traffic and how busy your server normally is.
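As a sketch of the same analysis in Python, the function below counts identical GET request URIs from access log lines and keeps the ones at or above a threshold. It assumes the common/combined apache log format, where the request appears as a quoted "GET /path HTTP/1.1" string; the names are illustrative.

```python
import re
from collections import Counter

# Matches the quoted request portion of an access log line and captures the URI.
REQUEST_PATTERN = re.compile(r'"GET (\S+) HTTP/[0-9.]+"')


def top_get_requests(log_lines, threshold=45):
    """Count identical GET URIs and return those seen at least `threshold` times."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    # most_common() sorts by number of requests, highest first.
    return [(uri, n) for uri, n in counts.most_common() if n >= threshold]
```

It can be fed an open log file directly, since iterating over a file object yields one line at a time.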

All the data you have gathered thus far should be enough to complete a preliminary investigation into your DoS attack. As for mitigating, there are many options. I won’t go too much into it because that could be considered a completely separate topic. I’ll give some suggestions for you, either way :

Block offending IPs with IPTABLES :
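A small Python sketch of building such a rule is shown here, mainly to validate the address before it reaches iptables. The function name is hypothetical, and appending a DROP rule to the INPUT chain is one common choice rather than the only one.

```python
import ipaddress


def build_block_command(address):
    """Build an iptables command that drops all traffic from an IPv4 address or subnet."""
    # ip_network() raises ValueError for anything that is not a valid address or CIDR block.
    source = ipaddress.ip_network(address, strict=False)
    if source.version != 4:
        raise ValueError("this sketch only handles IPv4; ip6tables would be needed for IPv6")
    return ["iptables", "-A", "INPUT", "-s", str(source), "-j", "DROP"]
```

Note that -A appends to the end of the chain; if an earlier rule already accepts the traffic, inserting with -I instead puts the block ahead of it.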

Use software layer mitigation such as mod_evasive or mod_security to reduce the ability of attackers to generate significant numbers of requests. Most importantly of all, use your best judgement!

SVN Offsite Backup Script : Secure offsite backup solution for SVN to Amazon S3

Hi there!

Backing up your code repository is important. Backing up your code repository to an off-site location in a secure manner is imperative. Throughout our travels and experience utilizing the SVN code repository system, we have developed a quick bash script to export the entire SVN repository, encrypt it, compress it into an archive, and then ship it (over an encrypted network connection) to Amazon S3 storage.

We will be using the (familiar) s3sync Ruby script to do the actual transport to Amazon S3, which you can find here.

Note that this script also keeps a local copy of the backups, taken each day, for a maximum of 7 days of retention. This might be redundant since all revisions are kept within SVN itself, but I thought it would provide an additional layer of backup redundancy. The script can be easily modified to only back up a single file every night, overwriting the older copy after every backup.
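The 7 day retention can be illustrated with a short Python sketch that removes local backup files older than a cutoff. The directory layout and function name are assumptions; the bash script may well handle this differently.

```python
import time
from pathlib import Path


def prune_old_backups(backup_dir, max_age_days=7, now=None):
    """Delete files in backup_dir older than max_age_days; return the removed file names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for path in sorted(Path(backup_dir).iterdir()):
        # Only plain files are considered; the age is taken from the modification time.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

Running the pruning step only after a successful new backup avoids ending up with no local copies if a backup run fails for several days.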

Here’s the script :

Note how I have provided an example , commented out within the script, on how you can go about decrypting the encrypted SVN dump file. You can also modify this script to back up to any offsite location, obviously. Just remove the s3sync related entries and replace with rsync or your preferred transport method.

I hope this makes your life easier!