Elliott Brueggeman - PHP and Web Development Info, Photography, and More
 
Posted on December 8, 2008 in MySQL, PHP by Elliott Brueggeman

Security is an important part of PHP programming, and PHP provides several tools for securing database queries and HTML display. However, knowing which function to use and when to use it can be confusing, as there are many details to pay attention to. It’s important not to leave your website open to cross-site scripting or SQL injection attacks.

This example and explanation will focus on a common web scenario – a user submitting data via a form, being asked to confirm it, and then being displayed the data after it is stored in the database.

1. Initial User Input

In this scenario, a user is presented with an input box in which they can enter a review of a particular product on a public-facing website. The user is not allowed to use HTML, similar to the restrictions on product reviews on Amazon.com. The user types their review into the input box and then clicks submit.

2. User Input Preview and Cross Site Scripting Prevention

Now, we want to show the user their review and have them confirm it before we add it to the database. This is a prime example of a situation that leaves a website open to an XSS (cross-site scripting) attack. We don’t know what the user entered on the previous page, and they could be purposely entering malicious content. Because what they entered is going to be output to the screen, any HTML or JavaScript they include would be sent to the browser as part of our page and could be executed there. To prevent this, we need to HTML encode all potentially dangerous characters before displaying them on the screen.

I would recommend using htmlentities() to do this, with the ENT_QUOTES option as shown below, which means that both single and double quotes will also be encoded. Because we also decided that we wouldn’t allow HTML, we’ll want to strip any HTML tags before displaying the results, to give the user an accurate preview. Before doing any of this, we’ll have to grab the submitted data from the submitted form. This tutorial won’t cover that step, but there are plenty of web tutorials available for it.

<?php
 
//strip HTML tags from input data
$input_data = strip_tags($input_data);
 
//turn all characters into their html equivalent
//(the third argument assumes the page is served as UTF-8)
$preview_data = htmlentities($input_data, ENT_QUOTES, 'UTF-8');
 
//...display $preview_data 
 
?>
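
To make the effect of these two calls concrete, here is a small sketch run against a made-up review string. The sample input is an assumption for illustration only, and the charset argument assumes a UTF-8 page.

```php
<?php
// Hypothetical input: a review containing an injected script tag and special characters.
$input_data = '<script>alert(\'x\')</script>Great product & "cheap"';

// Strip the tags first, then encode whatever text remains for safe display.
$stripped = strip_tags($input_data);
$preview_data = htmlentities($stripped, ENT_QUOTES, 'UTF-8');

echo $preview_data, "\n";
// alert(&#039;x&#039;)Great product &amp; &quot;cheap&quot;
```

Note that strip_tags() removes the script tags but leaves the text between them, which is one reason the encoding step is still needed afterwards.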

3. Database Insert and SQL Injection Protection

Now, let’s say the user previews the above data and clicks accept, which sends the data to another script for entry into the database. Protecting your database from SQL injection requires different steps than protecting against cross-site scripting.

It is very likely that you don’t want to store the user’s data in HTML-encoded form. Let’s say you are using a varchar(32) column in your database, and you have an input box that accepts 32 characters. If you tried to store the HTML-encoded equivalent of this in your database, you would need a much larger column to guarantee that data didn’t get lost. This is because the HTML equivalent of a single character is often several characters long. For example, a double quote (") becomes &quot;, and an ampersand (&) becomes &amp;. We still need to guard against single quotes, because these can cause an SQL injection problem, depending on how your database is set up. Consider this example:

<?php
 
$name = "George'); DELETE FROM mytable; INSERT INTO mytable (name) VALUES ('you got hacked";
 
$sql = "INSERT INTO mytable (name) VALUES ('$name')";
 
//...run the $sql query
 
?>

If your SQL server allows more than one SQL command on a single query request, you’ve just lost lots of data. While it’s a good idea to lock down your database server so it doesn’t allow this, you always want to secure your code independently of the database, in case you change your hosting setup later.
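
To see why this input is dangerous, it can help to print the query that actually gets built. The sketch below reuses the example values above and splits the string on semicolons purely for display; that split is an illustration, not a real SQL parser.

```php
<?php
$name = "George'); DELETE FROM mytable; INSERT INTO mytable (name) VALUES ('you got hacked";
$sql = "INSERT INTO mytable (name) VALUES ('$name')";

// Print each statement on its own line to make the injection visible.
// Splitting on ';' is only for display here; it is not how a database parses SQL.
$statements = array_map('trim', explode(';', $sql));
foreach ($statements as $i => $statement) {
    echo ($i + 1) . ': ' . $statement . "\n";
}
```

What was meant to be one INSERT has turned into three separate statements, with a DELETE in the middle.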

To secure your script, you can use the addslashes() function, which escapes single quotes, double quotes, backslashes, and NUL bytes by adding a backslash before them, so the input can no longer break out of the quoted string and start a second query. Be aware that addslashes() knows nothing about your database connection or its character set; where it is available, the escaping function provided by your database extension, or a prepared statement, is the more robust choice. You’ll also want to enforce the maximum length of the input field on the server, as forged form requests are easy to create using tools like Firebug. Here is example code that will accomplish that:

<?php
 
//make sure not longer than expected length (do this BEFORE escaping)
$name = substr($name, 0, 32);
 
//escape trouble characters
$name = addslashes($name);
 
$sql = "INSERT INTO mytable (name) VALUES ('$name')";
 
//...run the $sql query
 
?>

Note that the order of these two steps matters. If you escaped first and truncated afterwards, the cut could fall between a backslash and the quote it escapes, leaving a trailing backslash that escapes the closing quote of the SQL string and breaks the query. Truncating first avoids this. The added backslashes exist only in the query text: the database removes them when it parses the string, so they do not count against the 32 characters of the varchar(32) column, and the input box can safely allow all 32 characters.
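
An alternative to escaping by hand is a prepared statement, where the value is sent separately from the SQL text. The following is only a sketch under some assumptions: it uses PDO with an in-memory SQLite database (so the pdo_sqlite extension must be available) as a stand-in for your real database, and it reuses the mytable example from above.

```php
<?php
// Assumption: pdo_sqlite is available; an in-memory database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE mytable (name VARCHAR(32))');

// The same hostile input as before, cut to the expected column length.
$name = "George'); DELETE FROM mytable; INSERT INTO mytable (name) VALUES ('you got hacked";
$name = substr($name, 0, 32);

// The placeholder keeps the value out of the SQL text entirely.
$stmt = $pdo->prepare('INSERT INTO mytable (name) VALUES (:name)');
$stmt->execute(array(':name' => $name));

// Read the row back: the quote is stored as plain data, with no backslashes.
$stored = $pdo->query('SELECT name FROM mytable')->fetchColumn();
$row_count = (int) $pdo->query('SELECT COUNT(*) FROM mytable')->fetchColumn();
echo $stored, "\n";
```

With this approach there is nothing to escape, so the order of truncating and escaping is no longer a concern.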

4. Retrieving and Displaying the Data From the Database

Now that the data is safely in the database, everything is safe, right? Wrong. We elected not to store HTML-encoded data from users. This means that we are still at risk from a cross-site scripting attack every time we query and then display the data. What you’ll probably want to do is create a function that sanitizes database data before it is displayed on the screen. The function only needs to HTML encode the data: the backslashes added for the insert were consumed by the database when it parsed the query, so they are not part of the stored value, and calling stripslashes() here would remove backslashes the user legitimately typed. Here’s what your function could look like:

<?php 
 
function sanitize_data($input_data) {
  //encode for HTML output; assumes the page is served as UTF-8
  return htmlentities($input_data, ENT_QUOTES, 'UTF-8');
}
 
?>
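
As a rough sketch of how encoding at display time might look when listing several reviews: the rows below are hard-coded stand-ins for a database result, and the list markup is an assumption, not part of the original example.

```php
<?php
// Hypothetical rows, standing in for the result of a database query.
$rows = array(
    'Works well, 5/5',
    '<script>alert("hi")</script>',
);

$html = '';
foreach ($rows as $review) {
    // Encode at output time, every time the value is displayed.
    $html .= '<li>' . htmlentities($review, ENT_QUOTES, 'UTF-8') . "</li>\n";
}
echo "<ul>\n" . $html . "</ul>\n";
```

The second row reaches the browser as harmless text rather than as a script element.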
Posted on May 14, 2008 in PHP by Elliott Brueggeman

Last November, I published a post on creating a simple PHP Captcha. Some readers noted that I actually used a more advanced captcha for my own site’s discussion pages, and asked for the source code. I’ve gone a step further and improved my own captcha by adding more background noise, decreasing the likelihood that a bot would be able to break it. Although the result is harder for a bot to read, I also changed the font to make it easier for a human to read.

As noted before, Captchas are images that are meant to tell the difference between a person and a computer, by presenting an image with text in it and asking the user to enter the text and submit it with the form. Captchas are used to reduce spam on publicly accessible message boards and comment lists.

Below is the new and improved PHP captcha script and example implementation.

Requirements

To use this script, you will need a PHP-enabled web server with the Image (GD) extension enabled. If you are not sure, check with your hosting company. Most major hosting companies, like GoDaddy.com, provide this by default.
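
If you would rather check for yourself, a quick sketch like the following can be uploaded and run on the server. It only reports whether the functions the captcha script relies on are present; the labels are my own wording.

```php
<?php
// Report whether the pieces the captcha script depends on are available.
$checks = array(
    'GD extension loaded' => extension_loaded('gd'),
    'TrueType text support (imagettftext)' => function_exists('imagettftext'),
    'PNG output support (imagepng)' => function_exists('imagepng'),
);

foreach ($checks as $label => $ok) {
    echo $label . ': ' . ($ok ? 'yes' : 'no') . "\n";
}
```

If any line reports "no", the captcha script will most likely fail with an undefined function error.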

Setup

On the page where your users are submitting content, like a post or a comment, you want to display the captcha and an input box for the user to enter the text displayed in the captcha. To use the PHP script below, you need to include it as an image on the page where you want it to display: save the script as something like “advanced_captcha_script.php” and use the following HTML to display it on the page.

Include this HTML code inside your form tag:

<!-- display the script as an image --> 
<img src="advanced_captcha_script.php" /> 
<!-- an input box to input the captcha text -->  
<input name="captcha_input" id="captcha_input" />

The Script

<?php
session_start(); //MUST START SESSION 
$string_length = 6; //NUMBER OF CHARS TO DISPLAY 
$large_letters = array('m','w');
$rand_string = ''; 
 
for ($i=0; $i<$string_length; $i++) { 
  //PICK A RANDOM LOWERCASE LETTER USING ASCII CODE 
  $rand_string .= chr(rand(97,122)); 
}
 
//IMAGE VARIABLES 
$width = 100; 
$height = 36; 
 
//INIT IMAGE 
$img = imagecreatetruecolor($width, $height); 
 
//ALLOCATE COLORS 
$black = imagecolorallocate($img, 0, 0, 0); 
$gray = imagecolorallocate($img, 110, 110, 110); 
$medgray = imagecolorallocate($img, 180, 180, 180); 
$lightgray = imagecolorallocate($img, 220, 220, 220); 
//FILL BACKGROUND
imagefilledrectangle($img, 0, 0, $width, $height, $lightgray); 
 
//ADD NOISE - DRAW BACKGROUND SQUARES
$square_count = 10;
for ($i = 0; $i < $square_count; $i++) {
  //cast the divisions to int, since rand() expects integer bounds
  $cx = rand(0, (int)($width/2));
  $cy = rand(0, $height);
  $h  = $cy + rand(0, (int)($height/5));
  $w  = $cx + rand((int)($width/3), $width);
  imagefilledrectangle($img, $cx, $cy, $w, $h, $medgray); 
}
 
//ADD NOISE - DRAW ELLIPSES
$ellipse_count = 5;
for ($i = 0; $i < $ellipse_count; $i++) {
  //centers may fall outside the image so that only part of an ellipse is visible
  $cx = rand(-(int)($width/2), $width + (int)($width/2));
  $cy = rand(-(int)($height/2), $height + (int)($height/2));
  $h  = rand((int)($height/2), 2*$height);
  $w  = rand((int)($width/2), 2*$width);
  imageellipse($img, $cx, $cy, $w, $h, $gray);
}
 
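//REPLACE THIS WITH THE FONT YOU UPLOAD 
//use a full path: depending on the GD version, a bare file name may be searched for in GD's own font path
$font = dirname(__FILE__) . '/Tuffy.ttf'; 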
$font_size = 18; 
 
//CALC APPROX LOCATION - CUSTOMIZED FOR ABOVE FONT 
$y_value = ($height/2) + ($font_size/2); 
$x_value = 0;
 
//DRAW STRING USING TRUE TYPE FUNCTION 
 
for ($i = 0; $i < $string_length; $i++) {
  $chr = substr($rand_string, $i, 1);
  $x_value += 3 * ($font_size/5); 
  //coordinates must be integers, so round the running x position
  imagettftext($img, $font_size, 0, (int)round($x_value), (int)round($y_value), $black, $font, $chr); 
  //check to see if larger than normal letters, if so add more horiz space
  if (in_array($chr, $large_letters)) {
    $x_value += 4;
  }
}
 
$_SESSION['encoded_captcha'] = md5($rand_string . 'my_secret_key'); 
 
//OUTPUT IMAGE HEADER AND SEND TO BROWSER 
header("Content-Type: image/png"); 
imagepng($img); 
?>

Here’s an example of what the captcha looks like (refresh the page to generate a new captcha):

Script Notes

This script uses a font called “Tuffy.ttf”, which is in the public domain. You will need to download this font and upload it to the same directory as the captcha script. You can download this font from the homepage of its creator or directly from my site.

Checking The Result

After the user has submitted the form, we still have to validate their captcha text entry. Without validation, our captcha security is useless. The general flow of things after the form has been submitted is:

  1. Compare the user entry to the captcha text
  2. If they are the same, enter the user’s content into your database
  3. If they are different, display an error message to the user.

If you examine the PHP script above, you will notice that we store the generated captcha text in a session variable, concatenated with a secret key and hashed. If we call the PHP function session_start() on the next page after the form has been submitted, we will be able to access that stored session variable. We do not store the captcha as plain text, so that anyone who somehow gained access to the stored session data could not simply read the correct captcha text and type it into the input box, bypassing our security. Instead, we store it as an md5() hash. md5() is a one-way hash function rather than encryption, so the stored value cannot simply be decrypted. To validate the user entry, we concatenate the secret key onto the user’s input, run md5() on the result, and compare it to the hash stored in the session variable. Below is some PHP code that shows how we could validate the captcha:

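<?php
session_start(); 
//'captcha_input' is the field name used in the HTML above;
//'user_content' is assumed to be the name of your content field
$user_content = isset($_POST['user_content']) ? trim($_POST['user_content']) : ''; 
$user_captcha_input = isset($_POST['captcha_input']) ? trim($_POST['captcha_input']) : ''; 
if (isset($_SESSION['encoded_captcha'])) { 
  if ($_SESSION['encoded_captcha'] === md5($user_captcha_input . 'my_secret_key')) { 
    //..THE USER IS NOT A BOT, STORE THEIR INPUT
  } 
  else { 
    //THE USER ENTERED THE WRONG CAPTCHA, 
    //DISPLAY AN ERROR MESSAGE AND 
    //DO NOT STORE THEIR INPUT
  } 
  //EACH CAPTCHA IS GOOD FOR ONE ATTEMPT ONLY
  unset($_SESSION['encoded_captcha']);
} 
?>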
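
The comparison itself can also be pulled out into a small function, which makes it easy to try without a form or a session. This is only a sketch: captcha_matches() is a hypothetical helper name, and it uses hash_equals() (available from PHP 5.6) in place of a plain comparison.

```php
<?php
// Hypothetical helper: returns true when the submitted text matches the stored hash.
function captcha_matches($user_input, $stored_hash, $secret_key) {
    // Same recipe as the captcha script: text first, then the secret key, then md5().
    $candidate = md5(trim($user_input) . $secret_key);
    return hash_equals($stored_hash, $candidate);
}

// Simulate what the captcha script would have stored in the session.
$secret_key = 'my_secret_key';
$stored_hash = md5('abcxyz' . $secret_key);

$exact_match  = captcha_matches('abcxyz', $stored_hash, $secret_key);
$padded_match = captcha_matches(' abcxyz ', $stored_hash, $secret_key);
$wrong_match  = captcha_matches('abcxyy', $stored_hash, $secret_key);

var_dump($exact_match, $padded_match, $wrong_match);
```

In a real page, the stored hash would come from $_SESSION['encoded_captcha'], and it is worth clearing that session value after each attempt so the same captcha cannot be retried.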