  • ESCAPE OUTPUT Part 2

    Posted on March 28th, 2009 admin No comments

    Escaping HTML

    There are three main functions in PHP for escaping HTML: htmlentities(), htmlspecialchars(), and strip_tags(). In the case of strip_tags(), no special characters are actually escaped; instead, all HTML tags are removed. Using this function with no extra parameters is probably one of the safest ways to completely remove all HTML tags from output. I have seen other user-defined functions that attempt to do something similar by removing all but a set of allowed tags, but these are not without their flaws and can introduce nasty bugs that make them too lenient when outputting data. Likewise, strip_tags() offers the option to allow certain tags with the format strip_tags($str, '<p><a><b>');, but this is also too lenient: attributes are not stripped from allowed tags, allowing onclick events, etc. to persist in output. Take the following code snippet, for example:

    $str = '<p><b>Bold text</b>
    <a href="#" onclick="alert(\'XSS\');">Link</a>
    <img src="example.png"/></p>';

    echo strip_tags($str, '<p><a><b>');

    This code will output the following, complete with the cross-site scripting (XSS) in the onclick attribute:

    <p><b>Bold text</b>
    <a href="#" onclick="alert('XSS');">Link</a>
    </p>

    Rather than completely stripping the tags from output, a better alternative may be to escape all the tags so that they are displayed as literal text rather than interpreted as markup. This is an easy task with htmlspecialchars() and htmlentities(). Both of these functions serve the same purpose: to convert special characters into their equivalent HTML entities. The main difference is that htmlentities() is more exhaustive, converting all characters that have HTML character entity equivalents to their respective HTML entities. Thus, for its exhaustive nature, I recommend htmlentities() as the better function to use to escape HTML output. For the above $str example, htmlentities() returns the following:

    &lt;p&gt;&lt;b&gt;Bold text&lt;/b&gt;
    &lt;a href=&quot;#&quot; onclick=&quot;alert('XSS');&quot;&gt;Link&lt;/a&gt;
    &lt;img src=&quot;example.png&quot;/&gt;&lt;/p&gt;
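    To make the difference between the two functions concrete, here is a minimal sketch. The sample string is my own illustration, and passing ENT_QUOTES and 'UTF-8' explicitly is an assumption made so the result does not depend on the PHP version's defaults.

```php
<?php
// Illustrative input (not from the example above); the file is assumed to be saved as UTF-8.
$sample = 'Café © 2009 <b>bold</b> & "quoted"';

// htmlspecialchars() converts only the characters that are special in HTML:
// &, <, > and, with ENT_QUOTES, both kinds of quotation mark.
$special = htmlspecialchars($sample, ENT_QUOTES, 'UTF-8');

// htmlentities() also converts every other character that has a named entity,
// such as the accented "e" and the copyright sign.
$entities = htmlentities($sample, ENT_QUOTES, 'UTF-8');

echo $special, "\n";
// Café © 2009 &lt;b&gt;bold&lt;/b&gt; &amp; &quot;quoted&quot;
echo $entities, "\n";
// Caf&eacute; &copy; 2009 &lt;b&gt;bold&lt;/b&gt; &amp; &quot;quoted&quot;
```

    Either way, the markup itself is neutralized; the functions differ only in how they treat characters that are not special in HTML.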

    In this case, however, allowing the <b> tags may be preferable, and so we can allow them by first escaping the output and then converting the selected HTML entities back to HTML with str_replace():

    $str = htmlentities($str);
    $str = str_replace('&lt;b&gt;', '<b>', $str);
    $str = str_replace('&lt;/b&gt;', '</b>', $str);

    This will ensure that we send only those special characters that we desire to have interpreted to the client. While this is a form of unescaping, which I mentioned earlier is not a desirable process, it is nevertheless a good alternative to using strip_tags() to allow certain tags, as it will ensure that any tags that contain undesirable attributes are not interpreted by the client.

    In addition, there is no guesswork involved here; I am not using a regular expression that I could potentially get wrong and, thus, introduce a hole in my application. I will always know what a <b> tag looks like after the angle brackets have been converted to their HTML entity equivalents, so it is easy for me to find and convert the tags back to HTML.
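    The escape-then-restore approach can be wrapped in a small helper. This is only a sketch: escapeAllowingTags() and its $allowedTags parameter are hypothetical names of my own, and the tag list is assumed to be hard-coded by the developer, never taken from user input.

```php
<?php
/**
 * Escape everything, then restore a short list of attribute-free tags.
 * Hypothetical helper for illustration; $allowedTags must be trusted,
 * developer-supplied tag names such as ['b', 'i'].
 */
function escapeAllowingTags(string $str, array $allowedTags): string
{
    // Step 1: escape all special characters, so no markup survives.
    $escaped = htmlentities($str, ENT_QUOTES, 'UTF-8');

    // Step 2: convert back only the exact, attribute-free forms of each
    // allowed tag. A tag carrying attributes never matches these strings.
    foreach ($allowedTags as $tag) {
        $escaped = str_replace(
            ['&lt;' . $tag . '&gt;', '&lt;/' . $tag . '&gt;'],
            ['<' . $tag . '>', '</' . $tag . '>'],
            $escaped
        );
    }

    return $escaped;
}

$str = '<p><b>Bold text</b> <a href="#" onclick="alert(\'XSS\');">Link</a></p>';
$out = escapeAllowingTags($str, ['b']);

// The <b> pair is restored; the <p> and <a> tags, including the onclick
// attribute, stay escaped and are displayed as text.
echo $out, "\n";
```

    Because the replacement matches fixed strings only, a tag such as <b onclick="..."> remains escaped, which is exactly the property strip_tags() with an allowed-tags list lacks.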

    Escaping SQL

    Similarly, PHP offers excellent built-in functions for escaping SQL statements according to the database engine used. For PostgreSQL, there is pg_escape_string(); for MySQL, mysql_real_escape_string(); and for SQLite, sqlite_escape_string(). If the other native database functions provided in PHP do not offer a similar function, then PHP offers addslashes(), though I would advise that the database’s native escape string function is always a better alternative than addslashes(). Using the SQL example from earlier, we can escape it using mysql_real_escape_string(), as shown in Listing 1, where we first filter it using the filter() function I gave in the August 2005 issue. Thus, if a user enters the value "example' OR 1 = 1; --" as a username, the SQL that is executed will be:

    SELECT * FROM users
    WHERE username = 'example\' OR 1 = 1; --'
    AND password = 'password'

    The single quotation mark is escaped and no results are returned because this user doesn’t exist, so the user can’t gain access to the application.

    Some database functions, such as the unified ODBC functions, mysqli, and PDO (in PHP 5.1), use the concept of prepared statements to prepare and properly escape an SQL statement. Listing 2 illustrates a prepared statements example using PDO. The SQL statement that is created will appear much like the one listed above, but PDO offers added functionality through the optional bindParam() parameters to define the type and length of data. Prepared statements also exist in PEAR::DB and other database abstraction classes, but PDO offers much promise since it is built into the language and is, thus, much faster with less overhead. So, if possible, use prepared statements (with PDO, if possible). If they aren’t available, use the database’s built-in escaping function. If that isn’t available, then fall back on addslashes() as a last resort.
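    As a rough sketch of the prepared-statement approach (not a reproduction of Listing 2), the following uses PDO with an in-memory SQLite database. The SQLite backend, the table layout, the findUser() helper, and the length hint of 32 are all assumptions of mine, chosen so the example runs without a database server; it requires the pdo_sqlite extension.

```php
<?php
// Assumption: in-memory SQLite, used only so the sketch is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical table mirroring the query used earlier. The plain-text
// password only mirrors that query; it is not a storage recommendation.
$pdo->exec('CREATE TABLE users (username TEXT, password TEXT)');
$pdo->exec("INSERT INTO users (username, password) VALUES ('example', 'password')");

// Hypothetical helper: look up a user with a prepared statement.
function findUser(PDO $pdo, string $username, string $password): array
{
    $stmt = $pdo->prepare(
        'SELECT * FROM users WHERE username = :username AND password = :password'
    );

    // The optional third and fourth arguments declare the data type and a
    // length hint for each bound value.
    $stmt->bindParam(':username', $username, PDO::PARAM_STR, 32);
    $stmt->bindParam(':password', $password, PDO::PARAM_STR, 32);
    $stmt->execute();

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$legit  = findUser($pdo, 'example', 'password');
$attack = findUser($pdo, "example' OR 1 = 1; --", 'password');

echo count($legit), "\n";  // 1: the real user is found
echo count($attack), "\n"; // 0: the injection attempt is treated as a literal username
```

    The injected value never becomes part of the SQL text; it is compared against the username column as an ordinary string, so no row matches.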

    A Security-Conscious Mindset

    The key to secure programming is having a security-conscious mindset. Filtering input and escaping output is just part of that mindset, but it takes more thought than simply copying code from elsewhere to introduce security to an application. It takes careful planning and diligent testing. By now, I hope that you are well on your way to being a security-conscious programmer. I have introduced some tools and concepts to help you get started, and it is likely that you have thought of code you’ve already written and how to improve it using these principles. So, have fun, good luck, and be sure to keep security at the forefront of a project. Security is not a design feature; it is an essential tool.
