<h1>Blacklist Referer Spam Bots with NGINX</h1>
<p>2015-04-22T18:51:00+02:00 https://fadeit.dk/blog /2015/04/22/nginx-referer-spam-blacklist</p>
<h2 id="some-background">Some Background</h2>
<p>Recently, we submitted fadeit.dk to <a href="http://www.awwwards.com/best-websites/fadeit-1/">AWWWARDS</a> - the awards for design, creativity and innovation on the Internet. It gave us a nice little boost in website visitors. However, not all of that attention was positive…</p>
<h2 id="referer-spam-bots">Referer Spam Bots</h2>
<p>While the majority of our traffic came from genuine sources, we started noticing a pattern in our referral traffic.</p>
<p><img src="/blog/assets/nginx-referer-spam-blacklist/ga_dashboard.png" alt="Google Analytics Dashboard - Acquisition / Referrals" /></p>
<p>What’s up with <code class="highlighter-rouge">social-buttons.com</code>? We didn’t sign up for that… Apparently, this is something called <a href="http://en.wikipedia.org/wiki/Referer_spam">Referer Spam</a>*:</p>
<blockquote> <p>Referrer spam (also known as log spam or referrer bombing) is a kind of spamdexing (spamming aimed at search engines). The technique involves making repeated web site requests using a fake referer URL to the site the spammer wishes to advertise. Sites that publish their access logs, including referer statistics, will then inadvertently link back to the spammer’s site. These links will be indexed by search engines as they crawl the access logs. - Wikipedia</p> </blockquote>
<p>This is not cool, Mr. social-buttons.com. P**s off!</p>
<p><em>* No, I didn’t misspell it. The misspelling actually made it into the <a href="http://tools.ietf.org/html/rfc1945">HTTP/1.0 standard</a> and now it’s there forever :)</em></p>
<h2 id="nginx-solution">NGINX Solution</h2>
<p>Since we’re using NGINX to serve our site, the solution described here is for NGINX.
Apache people can take a look at this excellent <a href="https://www.addedbytes.com/blog/block-referrer-spam/">article</a>.</p>
<p>The official NGINX wiki does mention <a href="http://wiki.nginx.org/Referrer_Spam_Blocking">a solution</a> to this problem. Basically, you use <a href="http://nginx.org/en/docs/http/ngx_http_referer_module.html">ngx_http_referer_module</a>* and add something like this to your location or server block. Keep in mind that <code class="highlighter-rouge">valid_referers</code> is a whitelist directive: <code class="highlighter-rouge">$invalid_referer</code> is empty when the Referer matches the list and <code class="highlighter-rouge">1</code> otherwise. To use it as a blacklist, list only the spam domains and block when the variable is empty:</p>
<div class="language-bash highlighter-rouge"><pre class="highlight"><code>valid_referers <span class="k">*</span>.social-buttons.com social-buttons.com badreferer2.com;
<span class="k">if</span> <span class="o">(</span><span class="nv">$invalid_referer</span> <span class="o">=</span> <span class="s2">""</span><span class="o">)</span> <span class="o">{</span>
    <span class="k">return </span>444;
<span class="o">}</span>
</code></pre> </div>
<p>This works, but what if we want to maintain a larger blacklist of referers? Our <code class="highlighter-rouge">valid_referers</code> directive would get crazy long. If that’s fine with you, you can stop reading here. It sure isn’t fine with me :).</p>
<p>To make our blacklist more maintainable, we can use <a href="http://nginx.org/en/docs/http/ngx_http_map_module.html">ngx_http_map_module</a>.
Let’s create the file <code class="highlighter-rouge">/etc/nginx/conf.d/blacklist.conf</code> with the following content:</p>
<div class="language-bash highlighter-rouge"><pre class="highlight"><code><span class="c"># /etc/nginx/conf.d/blacklist.conf</span>
map <span class="nv">$http_referer</span> <span class="nv">$bad_referer</span> <span class="o">{</span>
    hostnames;

    default 0;

    <span class="c"># Put regexes for undesired referers here</span>
    <span class="s2">"~social-buttons.com"</span> 1;
    <span class="s2">"~semalt.com"</span> 1;
    <span class="s2">"~kambasoft.com"</span> 1;
    <span class="s2">"~savetubevideo.com"</span> 1;
    <span class="s2">"~descargar-musica-gratis.net"</span> 1;
    <span class="s2">"~7makemoneyonline.com"</span> 1;
    <span class="s2">"~baixar-musicas-gratis.com"</span> 1;
    <span class="s2">"~iloveitaly.com"</span> 1;
    <span class="s2">"~ilovevitaly.ru"</span> 1;
    <span class="s2">"~fbdownloader.com"</span> 1;
    <span class="s2">"~econom.co"</span> 1;
    <span class="s2">"~buttons-for-website.com"</span> 1;
    <span class="s2">"~buttons-for-your-website.com"</span> 1;
    <span class="s2">"~srecorder.co"</span> 1;
    <span class="s2">"~darodar.com"</span> 1;
    <span class="s2">"~priceg.com"</span> 1;
    <span class="s2">"~blackhatworth.com"</span> 1;
    <span class="s2">"~adviceforum.info"</span> 1;
    <span class="s2">"~hulfingtonpost.com"</span> 1;
    <span class="s2">"~best-seo-solution.com"</span> 1;
    <span class="s2">"~googlsucks.com"</span> 1;
    <span class="s2">"~theguardlan.com"</span> 1;
    <span class="s2">"~i-x.wiki"</span> 1;
    <span class="s2">"~buy-cheap-online.info"</span> 1;
    <span class="s2">"~Get-Free-Traffic-Now.com"</span> 1;
<span class="o">}</span>
</code></pre> </div>
<p>Now add the condition to each site for which you want to block referer spam bots:</p>
<div class="language-bash highlighter-rouge"><pre class="highlight"><code><span class="c"># /etc/nginx/sites-enabled/mysite.conf</span>
server <span class="o">{</span>
    <span class="c"># ...</span>
    <span class="k">if</span> <span class="o">(</span><span class="nv">$bad_referer</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">return </span>444;
    <span class="o">}</span>
    <span class="c"># ...</span>
<span class="o">}</span>
</code></pre> </div>
<p>OK, now let’s test if this thing works:</p>
<div class="language-bash highlighter-rouge"><pre class="highlight"><code><span class="c"># with subdomain</span>
<span class="nv">$ </span>curl --referer http://www.social-buttons.com https://fadeit.dk/en
curl: <span class="o">(</span>52<span class="o">)</span> Empty reply from server

<span class="c"># without subdomain</span>
<span class="nv">$ </span>curl --referer http://social-buttons.com https://fadeit.dk/en
curl: <span class="o">(</span>52<span class="o">)</span> Empty reply from server
</code></pre> </div>
<p>Sweet! It worked. The empty reply is expected: <code class="highlighter-rouge">444</code> is a special NGINX code that closes the connection without sending a response.</p>
<p><em>* Both <a href="http://nginx.org/en/docs/http/ngx_http_referer_module.html">ngx_http_referer_module</a> and <a href="http://nginx.org/en/docs/http/ngx_http_map_module.html">ngx_http_map_module</a> are included in the standard NGINX distribution and you don’t need to recompile your server.</em></p>
<h2 id="thats-it">That’s it!</h2>
<p>What’s your experience with Referer Spam? Don’t hesitate to use the comment section :)</p>
<h2 id="additional-resources">Additional Resources</h2>
<ul>
<li><a href="https://github.com/oohnoitz/nginx-blacklist">Blacklist by oohnoitz</a> which blocks bad bots, pentest tools, surveillance bots, etc. It’s an excellent addition to the referer spam blacklist described in this post.</li>
<li><a href="http://perishablepress.com/4g-ultimate-referrer-blacklist/">Ultimate referer blacklist</a></li>
</ul>
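<p>A closing note on the patterns in <code class="highlighter-rouge">blacklist.conf</code>: each <code class="highlighter-rouge">~</code> entry is a case-sensitive regular expression matched anywhere in the Referer value, and an unescaped <code class="highlighter-rouge">.</code> matches any character. The fragment below is a minimal sketch of a stricter variant, not part of the original setup. It assumes you prefer case-insensitive matching (<code class="highlighter-rouge">~*</code>), escaped dots, and patterns anchored to the scheme and host. It is meant to replace the entries in the existing map rather than be loaded alongside it, and only two of the domains are shown for illustration.</p>
<div class="language-bash highlighter-rouge"><pre class="highlight"><code># /etc/nginx/conf.d/blacklist.conf (stricter variant, illustrative only)
map $http_referer $bad_referer {
    default 0;

    # "~*" makes the regex case-insensitive.
    # "^https?://" anchors the match at the start of the Referer value.
    # "([^/]+\.)?" optionally matches any subdomain, such as "www.".
    "~*^https?://([^/]+\.)?social-buttons\.com" 1;
    "~*^https?://([^/]+\.)?get-free-traffic-now\.com" 1;
}
</code></pre> </div>
<p>Anchoring the pattern to the host should keep a genuine referer from being blocked just because its path or query string happens to contain a blacklisted name. The same <code class="highlighter-rouge">curl --referer</code> checks shown earlier can be used to confirm that both the bare domain and its subdomains still get an empty reply.</p>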