Replacing Potentially Malicious HTML Tags in Input with PHP
Submitted by charlie.collins on Fri, 03/22/2002 - 19:36
Recently I wanted to update the method I use for stripping potentially malicious HTML tags from form input on PHP sites. (NOTE: this should be done for ALL input on ALL sites, but it rarely is.)
The deal is that if you allow users to submit forms on your server, they could try to exploit the server with some heavy tags such as APPLET, OBJECT, EMBED or even the PHP delimiter itself! See the CERT Advisory on the subject for more details.
I had an old function that I had written a few years ago to do this using the PHP builtin str_replace() and some tag names. I decided to test it and update it. I am glad I did.
I had issues with my old function (in use for a long while) and with the strip_tags() builtin PHP function. I also tried many of the user-contributed examples on the strip_tags() PHP manual page. To my surprise, strip_tags() AND all of the examples allowed me to easily subvert them in testing. The issue is that they all appear to BE VULNERABLE TO CASE. That means that the applet tag may be disallowed but aPpLeT is not! (Maybe my implementation had issues, I don't know, but I WAS able to subvert strip_tags() as well. This is unusual, as most PHP builtins are rock solid, so it may have been anomalous.)
To remedy this I made a function that turns the HTML tag names into all upper case (as the example on the preg_replace() manual page shows) and THEN parses the input, replacing the malicious tags defined in an array with a warning. As follows:
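To make the case problem concrete, here is a minimal sketch of how a plain str_replace() filter behaves. The sample input strings and the "[removed]" marker are made up for illustration, and this only demonstrates the str_replace() approach, not strip_tags().

```php
<?php
// Hypothetical sample inputs: same tag, different letter case.
$mixed = '<aPpLeT code="Evil.class"></aPpLeT>';
$lower = '<applet code="Evil.class"></applet>';

// str_replace() is case-sensitive, so only the exact spelling "<applet" is caught.
$filteredMixed = str_replace('<applet', '[removed]', $mixed);
$filteredLower = str_replace('<applet', '[removed]', $lower);

// The mixed-case input comes back untouched.
var_dump($filteredMixed === $mixed);

// The lower-case input has its opening tag replaced.
echo $filteredLower, "\n";
```

Any filter built on exact string matches needs either a case-normalising step first or a case-insensitive comparison.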
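function inputCheck($body)
{
    // Upper-case every tag name so the comparison below cannot be dodged with mixed case.
    // The preg_replace() manual example used the /e modifier, which was removed in PHP 7;
    // preg_replace_callback() does the same job.
    $body = preg_replace_callback(
        '/(<\/?)(\w+)([^>]*>)/',
        function ($m) { return $m[1] . strtoupper($m[2]) . $m[3]; },
        $body
    );
    // "?" and "%" catch the PHP and ASP-style opening delimiters.
    $disallowedTags = array("APPLET", "OBJECT", "SCRIPT", "EMBED", "FORM", "?", "%");
    foreach ($disallowedTags as $value) {
        $body = str_replace("<" . $value, "WARNING: $value (tag not allowed.)", $body);
    }
    return $body;
}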
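For comparison, the same idea can be sketched without the upper-casing pass by using the case-insensitive str_ireplace() (available since PHP 5). This is an alternative sketch, not the function used on this site; the name inputCheckInsensitive and the sample input are hypothetical.

```php
<?php
// Hypothetical variant: match the disallowed tag openers regardless of case.
function inputCheckInsensitive(string $body): string
{
    $disallowedTags = array("APPLET", "OBJECT", "SCRIPT", "EMBED", "FORM", "?", "%");
    foreach ($disallowedTags as $value) {
        // str_ireplace() ignores case, so "<ScRiPt" matches "<SCRIPT".
        $body = str_ireplace("<" . $value, "WARNING: $value (tag not allowed.)", $body);
    }
    return $body;
}

// Sample input made up for this example.
$clean = inputCheckInsensitive('<ScRiPt>alert(1)</sCrIpT> <b>ok</b>');
echo $clean, "\n";
```

Like the function above, this only rewrites the opening "<TAG" sequence; the closing tag and the tag body are left in place as inert text.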
Whatever method is used, this input validation NEEDS to be done. Also, you still need to be careful with tag attributes: an otherwise allowed tag can carry script in an event-handler attribute such as onclick. And my recommendation is to do this at INPUT, not at display as some popular PHP packages do. That way the stored data itself is clean, not just its presentation, in case it is used elsewhere.
I have implemented this on TotSP and in a customized rev of the popular PHP discussion threads package Phorum. (Phorum now disallows all HTML and has special Phorum codes for links, bold, etc. I find that too restrictive and not really required, so I will submit a patch to Phorum to see if a new method might be better received.)
To try it out, post some crap in the Phorum by simply responding to this message below.
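As a rough illustration of the attribute problem, the sketch below flags input where any tag carries an on* event-handler attribute. The helper name hasEventHandlerAttribute and the sample strings are hypothetical, and a regular expression like this is only a first-pass check under simple assumptions (whitespace before the attribute name), not a complete filter.

```php
<?php
// Hypothetical helper: true if some tag contains an attribute whose name starts with "on".
function hasEventHandlerAttribute(string $body): bool
{
    // "<", then anything up to the tag end, whitespace, "on...", optional spaces, "=".
    return preg_match('/<[^>]*\son\w+\s*=/i', $body) === 1;
}

// An IMG tag is not on the disallowed list above, but this one still carries script.
$risky = '<img src="x.png" onerror="alert(1)">';
$harmless = '<b>bold</b> carry on = fine';

var_dump(hasEventHandlerAttribute($risky));
var_dump(hasEventHandlerAttribute($harmless));
```

Input that trips a check like this could be rejected or cleaned at submission time, in line with the recommendation to validate at input.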





