Output Encoding

How does Output Encoding help?

Values received from a data source should be treated only as what they are (data) and nothing else. This means that if something malicious is passed into a relevant resource, it will not be rendered and will not trigger any unintended events.

Output Encoding Methods

I. Escape Characters

An escape sequence is a string, usually beginning with a backslash ("\"), followed by a character or code that forces it to be interpreted differently, typically as literal data rather than as syntax.

EXAMPLE: SQL Injection

Make sure that strings are not injectable

Imagine the input, ' OR 1=1; --, is passed to a dynamic query using SELECT:

SELECT * FROM users WHERE id='' OR 1=1; --';

The injected quote closes the string early, so the database runs a valid query whose condition is always true. With the quote escaped, the whole input is treated as a string and nothing else:

SELECT * FROM users WHERE id='\' OR 1=1; --';
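
The escaping step can be sketched in JavaScript as follows. This is a minimal illustration, not a complete defense: escapeSqlString is a hypothetical helper, and it assumes a database that accepts backslash escapes inside string literals (as MySQL does by default).

```javascript
// Hypothetical helper: backslash-escapes the characters that could end or
// alter a single-quoted SQL string literal (the backslash and the quote).
// Assumes the database honors backslash escapes in string literals.
function escapeSqlString(value) {
  return String(value).replace(/[\\']/g, (ch) => '\\' + ch);
}

const input = "' OR 1=1; --";
const query = "SELECT * FROM users WHERE id='" + escapeSqlString(input) + "';";

console.log(query);
// prints: SELECT * FROM users WHERE id='\' OR 1=1; --';
```

In practice, a database driver's own escaping or parameterized queries are usually preferable to a hand-rolled helper like this one.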

II. HTML Entities

An HTML entity is a string that begins with an ampersand ("&") and ends with a semicolon (";"). Entities are used when generating content with multiple encodings and are particularly useful for displaying reserved characters that would otherwise be invisible because the browser interprets them as markup.

EXAMPLE: Cross-Site Scripting (XSS)

An unescaped “<” or “>” could potentially be dangerous.

In a page served with a content type such as text/html, text/xml, or image/svg+xml, the following is displayed as plain text and is not executed as a script:

&lt;script&gt;alert(1);&lt;/script&gt;

As opposed to the following, which the browser executes as a script, leading to a Cross-Site Scripting (XSS) vulnerability:

<script>alert(1);</script>
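
A minimal sketch of the encoding step, assuming the value is inserted into HTML element content or a quoted attribute. The names escapeHtml and HTML_ENTITIES are hypothetical and used only for illustration.

```javascript
// Hypothetical lookup table: HTML-reserved characters and their entities.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Replaces every reserved character with its entity so that the browser
// displays the value as text instead of parsing it as markup.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtml('<script>alert(1);</script>'));
// prints: &lt;script&gt;alert(1);&lt;/script&gt;
```

Other output contexts, such as inline JavaScript or URLs, need their own encoding rules, so this helper should not be assumed to cover them.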

III. Data Serialization

Serialization is the process of transforming objects into a data format that can be restored later, usually for storage or transmission. The format depends heavily on the programming language in use, since objects are translated into a representation that the language understands.

EXAMPLE: The very old node-serialize module

The danger usually arises when users can pass serialized input that the application then deserializes. This section focuses less on encoding and more on making sure that data types are strictly set for each parameter value.

Consider this code block where an object with a function that executes the system command, id, is serialized:

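// An object whose only property is a function that runs the system command `id`.
var poc = {
  command_execution : function(){
    require('child_process').exec('id', function(error, stdout, stderr) {
      console.log(stdout)
    });
  },
}
// Unlike JSON.stringify, node-serialize also serializes functions.
var serialize = require('node-serialize');
console.log(serialize.serialize(poc));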

It would return the following serialized data (with \n and some spaces cleaned up). A "()" is appended by hand to the end of the function so that it is invoked immediately on deserialization:

{"command_execution":"_$$ND_FUNC$$_function(){ require('child_process').exec('id', function(error, stdout, stderr) { console.log(stdout) }); }()"}

Now, what would happen if the above data is deserialized?

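var serialize = require('node-serialize');
// Placeholder: substitute the serialized string shown earlier.
var payload = '<the serialized data from earlier>';
// unserialize() evaluates the function string, so the trailing "()" invokes it immediately.
serialize.unserialize(payload);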

After deserialization, the following is printed:

$ node deserialize.js

  uid=1000(jebidiah) gid=1000(jebidiah) groups=1000(jebidiah),4(adm)

The command, id, was successfully executed. A server that deserializes such input could therefore be exploited through remote command execution, which is a critical vulnerability.
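
One way to enforce strict data types, sketched under assumptions: untrusted input is parsed with JSON.parse, which only produces plain data and never functions, and each expected field is then checked. The function parseUserPayload and the field name username are hypothetical, chosen only for illustration.

```javascript
// Hypothetical validator: parses untrusted text as plain JSON and accepts
// only the expected shape and types.
function parseUserPayload(text) {
  const data = JSON.parse(text); // throws SyntaxError on malformed JSON
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    throw new TypeError('payload must be a JSON object');
  }
  // "username" is an assumed field; allow only a short alphanumeric string.
  if (typeof data.username !== 'string' || !/^[A-Za-z0-9_]{1,32}$/.test(data.username)) {
    throw new TypeError('username must be a short alphanumeric string');
  }
  // Return a fresh object holding only the validated field.
  return { username: data.username };
}

console.log(parseUserPayload('{"username":"jebidiah"}'));

try {
  // A function smuggled in as a string fails the type and pattern check.
  parseUserPayload('{"username":"_$$ND_FUNC$$_function(){ return 1 }()"}');
} catch (err) {
  console.log('rejected:', err.message);
}
```

Because the parsed value is never evaluated and only known fields are copied out, a payload like the one above stays an inert string and is rejected instead of being executed.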

TAKEAWAYS

  • Output encoding goes hand in hand with Input Validation.

  • When using open source libraries, check for known vulnerabilities and security fixes, and make sure that the libraries are well maintained.

  • Always be wary of parameters that users control, in order to avoid insecure deserialization of objects from malicious sources.

  • Make sure that deserialization is done over a secure channel or in a low-privilege environment as much as possible. This makes lateral movement a bit harder.
