How a developer introduced XSS via a Django template tag

From time to time I do consulting on Django projects. This time the customer wanted me to review a large codebase with a focus on security. Most large projects have “do not touch this or the sky will fall and you’ll be parsing HTML with regular expressions in hell for eternity” code. It is a big problem: no one except a highly motivated person (like me ;) knows how the project works. A developer who is not familiar with the project’s details can introduce a vulnerability (or several), and it is pure luck if anyone notices.

Let’s take a look at a sample Django project that consists of three files. The main module (app.py):

from django.conf import settings

if not settings.configured:
    settings.configure(
        DEBUG=True,
        ROOT_URLCONF='app',
        TEMPLATE_DIRS=('.', ),
    )


from django.conf.urls.defaults import patterns
from django.template.base import libraries
from django.shortcuts import render

import tags


libraries['twitter'] = tags.register


urlpatterns = patterns('',
    (r'^$', lambda request: render(
        request,
        'index.html',
        {'link': 'twitter.com/test"></a><script>alert(\'pwned\')</script>'}
    )),
)


if __name__ == '__main__':
    from django.core.management import execute_from_command_line
    execute_from_command_line()

One template (index.html):

{% load twitter %}{% render_twitter_account_link link %}

The module with template tags (tags.py):

import urlparse

from django import template


register = template.Library()


@register.simple_tag
def render_twitter_account_link(url):
    netloc = 'twitter.com'

    if has_scheme(url) or url.startswith(netloc):
        url = ensure_url_scheme(url)

        parsed = urlparse.urlparse(url)
        path = parsed.path.strip('/')
        extracted = path or parsed.fragment.lstrip('!/')

        if (not extracted) or (parsed.netloc != netloc):
            return ''

        name = u'@%s' % extracted
    else:  # '@name' or 'name'
        name = url
        url = u'https://%s/%s' % (netloc, name.lstrip('@'))

    return u'<a href="%(url)s" rel="nofollow">%(name)s</a>' % {
        'url': url,
        'name': name,
    }


def ensure_url_scheme(url):
    if has_scheme(url):
        return url
    return u'http://%s' % url


def has_scheme(url):
    return url.startswith(('http://', 'https://'))

If you run python app.py runserver and go to http://127.0.0.1:8000/, you’ll get a JavaScript alert saying that you’re “pwned”.

Let’s see what the browser got:

<a href="http://twitter.com/test"></a><script>alert('pwned')</script>" rel="nofollow">@test"></a><script>alert('pwned')</script></a>

You may have noticed that the link context variable contains twitter.com/test"></a><script>alert(\'pwned\')</script>. I hardcoded it just to make things clearer. That link is passed to the template tag: {% render_twitter_account_link link %}. Because it starts with twitter.com, name (look at the template tag’s code) will contain @test"></a><script>alert('pwned')</script> and url will contain http://twitter.com/test"></a><script>alert('pwned')</script>. Both values are interpolated into the returned HTML string unchanged.
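The trace above can be reproduced without running Django. The following is a minimal sketch, assuming Python 3, where the urlparse module used in tags.py lives in urllib.parse. The function render_unescaped is a hypothetical, simplified copy of the tag's first branch (it skips the netloc and fragment checks) and is not part of the project.

```python
from urllib.parse import urlparse  # Python 3 home of the Python 2 urlparse module

PAYLOAD = 'twitter.com/test"></a><script>alert(\'pwned\')</script>'


def render_unescaped(url):
    """Hypothetical, simplified copy of the tag's first branch (no escaping)."""
    if not url.startswith(('http://', 'https://')):
        url = 'http://%s' % url
    parsed = urlparse(url)
    # The payload contains no '?', '#' or ';', so all of it ends up in path.
    name = '@%s' % parsed.path.strip('/')
    # Both values are pasted into the markup exactly as they arrived.
    return '<a href="%s" rel="nofollow">%s</a>' % (url, name)


markup = render_unescaped(PAYLOAD)
print(markup)
```

The printed string contains a live script element twice: once where the payload breaks out of the href attribute, and once inside the link text.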

The developer of this template tag forgot what the Django documentation says:

The output from template tags is not automatically run through the auto-escaping filters.
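What running a value through escaping means can be sketched in plain Python. This snippet uses html.escape from the Python 3 standard library as a stand-in for Django's escape, an assumption made only so that it runs without Django; the entity chosen for the single quote can differ between the two.

```python
from html import escape  # stdlib stand-in for django.utils.html.escape

raw = '"></a><script>alert(\'pwned\')</script>'
safe = escape(raw)  # quote=True by default, so " and ' are escaped as well

print(safe)
```

After escaping, the value contains no literal angle brackets or quotes, so the browser displays it as text and cannot interpret it as markup.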

The simplest fix is to escape the template tag’s input as early as possible:

import urlparse

from django import template
from django.utils.html import escape


register = template.Library()


@register.simple_tag
def render_twitter_account_link(url):
    netloc = 'twitter.com'

    url = escape(url)

    if has_scheme(url) or url.startswith(netloc):
        url = ensure_url_scheme(url)

        parsed = urlparse.urlparse(url)
        path = parsed.path.strip('/')
        extracted = path or parsed.fragment.lstrip('!/')

        if (not extracted) or (parsed.netloc != netloc):
            return ''

        name = u'@%s' % extracted
    else:  # '@name' or 'name'
        name = url
        url = u'https://%s/%s' % (netloc, name.lstrip('@'))

    return u'<a href="%(url)s" rel="nofollow">%(name)s</a>' % {
        'url': url,
        'name': name,
    }


def ensure_url_scheme(url):
    if has_scheme(url):
        return url
    return u'http://%s' % url


def has_scheme(url):
    return url.startswith(('http://', 'https://'))

Note what we did: first we imported the escape function via from django.utils.html import escape, then we escaped url (url = escape(url)) as early as possible, i.e. before it is used anywhere else in the code.

Now the browser gets a properly escaped link:

<a href="http://twitter.com/test&quot;&gt;&lt;/a&gt;&lt;script&gt;alert(&#39;pwned&#39;)&lt;/script&gt;" rel="nofollow">@test&quot;&gt;&lt;/a&gt</a>
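Notice that the link text ends at &gt and does not show the whole escaped payload. A likely explanation is that escaping before parsing introduces ; and # characters (from entities such as &gt; and &#39;), which urlparse treats as the params and fragment separators. The sketch below, assuming Python 3's urllib.parse and the escaped URL from the output above, shows that split. It also includes render_escape_late, a hypothetical variant that parses the raw URL first and escapes each piece afterwards; it omits the tag's netloc checks and is only one possible alternative, not the fix used in this note.

```python
from html import escape  # stdlib stand-in for django.utils.html.escape
from urllib.parse import urlparse

escaped_url = ('http://twitter.com/test&quot;&gt;&lt;/a&gt;&lt;script&gt;'
               'alert(&#39;pwned&#39;)&lt;/script&gt;')

parsed = urlparse(escaped_url)
print(parsed.path)      # stops at the first ';' after the last '/'
print(parsed.params)    # the text between that ';' and the first '#'
print(parsed.fragment)  # everything after the first '#'


def render_escape_late(url):
    """Hypothetical variant: parse the raw URL, then escape every piece."""
    name = '@%s' % urlparse(url).path.strip('/')
    return '<a href="%s" rel="nofollow">%s</a>' % (escape(url), escape(name))


late = render_escape_late('http://twitter.com/test"></a><script>alert(1)</script>')
print(late)
```

Either way the output is safe to render; the difference is only whether the visible name is cut short by the parser.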

Note metadata

Publication date

August 7, 2012
