The next thing that we’re going to talk about is data sanitization in WordPress. We’re going to talk about it again next week when we work on creating the theme options page. It’s not a trivial thing to consider, plan for and use, and it isn’t often discussed well, but it’s an absolutely essential part of creating a secure child theme and a secure widget.
Really, any time you are taking in input from anybody, you need to make sure that you have properly sanitized that information.
What is Data Sanitization?
Data sanitization essentially prevents malicious code from acting in the database or on the website. The most common place that malicious code enters is through some kind of a form. The thing about malicious code is that it doesn’t have to respect the form elements that you use in your form. You might use a checkbox or a radio button and think to yourself, those things don’t need to be sanitized because they only take a 0 or 1, a true or false or an on or an off.
In fact, an attacker can exploit those and inject malicious code into them if they are not protected by sanitization. There isn’t any data input, whether it’s a checkbox, a radio button, a color picker or a file uploader, that you, as the author of your child theme or the author of your widget, should accept without sanitizing it. You should not allow anything to be unsanitized.
Approaches to Data Sanitization
There are a variety of approaches to data sanitization. We’re going to talk about four of them here today.
Remove Code Functionality from Data
The first one is removing code functionality from the data. That essentially means taking something that has code and changing it so that its code can’t act. The code is still there but it can no longer act.
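To make that idea concrete, here is a minimal sketch in plain PHP, outside of WordPress. It uses PHP’s built-in htmlspecialchars function; the sample input and variable names are made up for illustration and are not part of the widget.

```php
<?php
// Hypothetical hostile input; the variable names are illustrative only.
$raw = '<script>alert("hi")</script>';

// Convert <, >, & and both kinds of quotes into HTML entities.
// The text is all still there, but a browser will display it instead of running it.
$safe = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo $safe, PHP_EOL;
// prints: &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;
```

Nothing has been removed from the string, it has only been encoded, which is the difference between this approach and stripping.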
The next thing is whitelisting data, where you say what the possible range of choices is. That’s what you would do with a checkbox, for example. With a checkbox, you whitelist the data so that it can either be checked or not checked; it can be a 0 or a 1 and it can’t be anything else. If it’s ever anything else, you change it to a 0 or a 1.
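As a rough sketch of whitelisting in plain PHP, the helper below only lets through values that appear on an allowed list and falls back to a default for everything else. The function name example_whitelist is hypothetical; it is not a WordPress or PHP built-in.

```php
<?php
// Hypothetical helper: return $value only if it is on the allowed list,
// otherwise return the default. The third argument to in_array() turns on
// strict comparison so that '1abc' can never be mistaken for '1'.
function example_whitelist($value, array $allowed, $default)
{
    return in_array($value, $allowed, true) ? $value : $default;
}

// A checkbox may only ever be '0' or '1'.
var_dump(example_whitelist('1', ['0', '1'], '0'));        // string(1) "1"
var_dump(example_whitelist('<script>', ['0', '1'], '0')); // string(1) "0"
```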
Secure Access to Data
The third thing is securing access to data which means that you check to see whether or not the submitter has permission to edit the data. If they have permission to edit the data, then you treat them differently than if they don’t have permission to edit the data.
The final one we’ll talk about today is typecasting. If you’ve worked with programming languages, you’ll know that most languages have data types. PHP is not considered a strongly typed language, so if something wants a number, you can give it an integer, a floating-point number or a numeric string and any of those will work in addition. For example, you can take the integer 3 and add it to the string 5 and the result will be the integer 8.
In any case, PHP does have data types but it’s not a strongly typed language. However, if the only correct value for something is an integer, that is, a whole number, then you can essentially say that the only appropriate value for this is an integer. That’s called typecasting: you cast the value to the data type you want.
PHP will convert any string into an integer, so any combination of characters can be cast to an integer by PHP. Once it has been cast to an integer, any malicious code is rendered harmless because it’s no longer code, it has been changed into an integer. We’ll do some typecasting here today as well.
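Here is a small sketch of what that looks like in plain PHP. The variable names and sample strings are made up for illustration.

```php
<?php
// Adding an integer and a numeric string gives an integer.
$sum = 3 + '5';
var_dump($sum);       // int(8)

// Casting keeps any leading digits and drops everything after them.
$count = (int) '42abc';
var_dump($count);     // int(42)

// A string with no leading digits becomes 0, so the script is gone.
$hostile = (int) '<script>alert(1)</script>';
var_dump($hostile);   // int(0)

// Note that an integer cast still allows negative numbers.
$negative = (int) '-7';
var_dump($negative);  // int(-7)
```

If only positive whole numbers make sense for a setting, the cast on its own isn’t enough and you would also need to check the range.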
Where to Sanitize the Data
There are two places in this widget where we are going to sanitize data. The first place is inside the form, and we’re going to be sanitizing the title, this checkbox and the message. The first kind of sanitization we’re going to do is called escaping attributes, or esc_attr.
The way that works here is by wrapping the value in parentheses and then using the function esc_attr. If we go look at esc_attr, what it does is encode the less than, the greater than, the ampersand, double quotes and single quotes. Essentially what that means is that it converts each one of those characters into an HTML entity so it can no longer act as code.
That’s what happens here first and that’s what we’re doing with this title. We’re just running esc_attr on it, and that encodes all of those characters in the title.
Using A Ternary Operator
Now the show_title is going to be different. What we’re going to do with show_title at this point is either give it a value of checked or a value of nothing. The way we’re going to do that is with what’s called a ternary operator. You could do this with an if/else statement, but it’s easier to put on one line without an if/else statement.
A ternary operator is essentially an if/else statement, and it says, if there is a show_title then its value is equal to checked="checked" and if its value is not equal to checked="checked", then it’s equal to blank, essentially nothing, empty single quotes.
It’s an if/else statement: if $instance['show_title'] equals checked="checked" then it equals checked="checked", else it equals blank. This is a way of whitelisting it, so if there was some falsified data put in there, the result is still either checked="checked" or blank. This is where that checked thing happens. Remember I said that I was deleting checked here for the moment so that we could simplify this?
Values of Checkbox
I want to come over here and go back to w3schools and look at type="checkbox". The type attribute has all these different values, and one of them is type="checkbox", which has the potential attribute of checked. Unfortunately, I don’t see it here, but what checked does is set the initial value for the checkbox. If you always want the initial value of the checkbox to be checked, you would say checked="checked" right here inside the opening input tag. That will automatically make the default value checked.
We’re going to return the actual condition to the form. Instead of hard-coding checked="checked", what we’re going to do is echo $show_title. Notice how we say $show_title is equal to $instance['show_title'], and if that has a value, it becomes checked="checked", and if it doesn’t, it becomes blank? What this is going to do is return either checked="checked" or nothing.
If it returns nothing, the checkbox shows up unchecked, just like I demonstrated without this. If show_title has a value, then it’s going to return checked="checked".
I may have described this incorrectly when I talked about the ternary operator. What this says is, if there is an $instance['show_title'], then its value is checked="checked", else its value is blank. So if $instance['show_title'] exists, then whatever it happens to be is converted to checked="checked". If it doesn’t exist, then $show_title is equal to blank. That’s what’s happening here.
What we do down here then is echo $show_title. It’s either going to echo blank or it’s going to echo checked="checked". I’m just going to show you what it would look like otherwise because I feel like I didn’t describe that very well. This is essentially the equivalent: if ( $instance['show_title'] ) then $show_title = 'checked="checked"', else $show_title = ''. This $show_title = $instance['show_title'] ? 'checked="checked"' : ''; is the same thing as this right here.
If there is a value in $instance['show_title'] then $show_title = 'checked="checked"', so it automatically replaces whatever happens to be there with checked="checked", else $show_title = ''. We’re not going to do that though. We’re going to continue to use the ternary operator because it’s quite a bit less code, it’s nice and succinct, and we’re going to use it over and over again for checkboxes.
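Here are the two versions side by side as a small stand-alone sketch. The $instance array is a made-up stand-in for the widget’s saved settings, and $show_title_long is only there so the two results can be compared.

```php
<?php
// Hypothetical saved settings for the widget.
$instance = ['show_title' => 1];

// The long form with if/else.
if ($instance['show_title']) {
    $show_title_long = 'checked="checked"';
} else {
    $show_title_long = '';
}

// The same decision on one line with the ternary operator.
$show_title = $instance['show_title'] ? 'checked="checked"' : '';

var_dump($show_title === $show_title_long); // bool(true)
```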
Using Textarea Function
Now, the next thing we’re going to do is fix our message here. For the first part of fixing our message we’re going to use the function called esc_textarea, which is a WordPress function for escaping text that is going to be shown inside of a textarea. It essentially encodes the HTML characters in the message so that any HTML tags it contains are displayed as text in the textarea rather than acting as part of the form.
It’s a relatively new data sanitization function and it definitely makes sanitizing a lot easier, because anything in the message that could act as HTML ends up being escaped. That’s what’s happening here, esc_textarea($instance['message']). We’ve got our form sanitized.
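Putting the three form values together, a sketch of this part of the form might look like the following. The function name example_form_values is hypothetical and only stands in for the body of the widget’s form method. esc_attr and esc_textarea are WordPress functions, so calling this helper only works inside WordPress.

```php
<?php
// Hypothetical helper that mirrors what the form does with the saved settings.
// Assumes $instance already has 'title', 'show_title' and 'message' keys.
function example_form_values(array $instance): array
{
    return [
        // Encode <, >, &, and quotes so the title is safe inside value="...".
        'title'      => esc_attr($instance['title']),
        // Whitelist the checkbox: the only possible results are these two strings.
        'show_title' => $instance['show_title'] ? 'checked="checked"' : '',
        // Encode the message so it is displayed as text inside the textarea.
        'message'    => esc_textarea($instance['message']),
    ];
}
```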
Sanitize the Update Using Strip_tags
Now we need to sanitize our update. In sanitizing our update, what we’re going to do here is clean it up even more. For our title, we’re going to take out all HTML tags period. We’re going to use strip_tags as the function.
If we come over here and look at strip_tags, this is really a PHP function. The strip_tags just strips all HTML and PHP tags out of that string. Up here, we were escaping it, we were including escape characters in that but down here, what we’re going to do is strip all HTML and PHP out of it period so that there is no HTML or PHP tags allowed inside of this title.
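A quick stand-alone sketch of strip_tags, with a made-up title for illustration:

```php
<?php
// Hypothetical title with markup and a script tag in it.
$title = 'My <b>bold</b> <script>alert(1)</script> title';

// strip_tags() removes the tags themselves but leaves the text between them.
$clean = strip_tags($title);

echo $clean, PHP_EOL;
// prints: My bold alert(1) title
```

Notice that the text that was between the tags is still there. Without the tags around it, it is just text.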
Now, show_title is again going to get this ternary operation, and it’s this: $instance['show_title'] = $new_instance['show_title'] ? 1 : 0; If there is something in show_title, no matter what that something is, replace it with a 1, and if there is nothing, replace it with a 0. That’s essentially checked and unchecked.
What we’ve done up here is take this 1 and convert it to checked="checked" to be used in our input tag, but once it’s been submitted, we want to bring it back to its allowable value. No matter what its value is, if there’s any value entered at all, it’s going to be a 1, which means checked, and if there’s no value in it, then it’s going to be a 0. In a test like this, if there’s a 0 in it, $instance['show_title'] is going to evaluate to false. If it was a minus 1, it would be true, but 0 evaluates to false in this case.
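Here is that normalization as a small sketch you can run on its own. The function name example_normalize_checkbox is hypothetical.

```php
<?php
// Hypothetical helper: whatever was submitted, the stored value is 1 or 0.
function example_normalize_checkbox($submitted): int
{
    return $submitted ? 1 : 0;
}

var_dump(example_normalize_checkbox('on'));       // int(1)
var_dump(example_normalize_checkbox('<script>')); // int(1)
var_dump(example_normalize_checkbox(''));         // int(0)
var_dump(example_normalize_checkbox(-1));         // int(1), minus 1 is true
var_dump(example_normalize_checkbox(0));          // int(0), 0 is false
```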
Ability to Enter Unfiltered HTML
Our message is going to get a whole different treatment. This is the new code we’re going to be using. Instead of $instance['message'] = $new_instance['message'], we are going to do something different. We’re going to use current_user_can and ask about a capability called unfiltered_html. That is one of WordPress’ built-in capabilities, and an Admin can create unfiltered HTML.
Essentially what we’re asking here is, if the user’s role has the capability of entering unfiltered HTML, then we’re actually not going to alter the content at all and $instance['message'] will equal $new_instance['message']. However, if that’s not the case, then we’re going to really sanitize the daylights out of the message.
Use Filters to Sanitize the Data
To sanitize the daylights out of it, we’re going to use 3 different functions. The first thing we’re going to do is take our $new_instance['message'] and add slashes to it. What that does is put a backslash in front of the quotes and backslashes in the message. We do that because the filter we’re about to use expects its input to be slashed.
Then we’re going to use this filter called wp_filter_post_kses. The kses stands for “kses strips evil scripts”. WordPress has kses filters set up for posts, for titles and for a bunch of other things. Each one has its own definition of what gets stripped, and you can use any one of them.
In this case, we’re going to use the post kses filter. It has a set of tags and attributes that are allowed inside of a post and it strips everything else out. So the first thing we do is add slashes, and the second thing we do is use this filter to strip out all the evil scripts.
The third thing we do is strip the slashes back out, because now we have a nice clean sanitized message and we can take those slashes back out of it. Rather than using one filter, we’re compounding them. We’re using addslashes first, then we’re using wp_filter_post_kses and then we’re using stripslashes to take out the slashes that we put in. That is super sanitizing the text here.
In this case, we’re going to assume that if they have the ability to add unfiltered HTML then we’re going to let them add unfiltered HTML to the message. If they can’t then we’re going to neuter that message entirely and we’re going to return this instance. That’s your update and that’s all the data sanitization.
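Pulled together, the update logic described above might be sketched like this. The function name example_update is hypothetical and stands in for the widget’s update method. current_user_can and wp_filter_post_kses are WordPress functions, so calling this only works inside WordPress. One small departure from what was said above: this sketch uses !empty() on the checkbox, on the assumption that an unchecked box may be missing from $new_instance altogether.

```php
<?php
// Hypothetical stand-alone version of the update logic.
// Assumes $new_instance has 'title' and 'message' keys.
function example_update(array $new_instance, array $old_instance): array
{
    $instance = $old_instance;

    // Title: no HTML or PHP tags allowed at all.
    $instance['title'] = strip_tags($new_instance['title']);

    // Checkbox: any value at all becomes 1, nothing becomes 0.
    // !empty() is an assumption here, to cover a missing key.
    $instance['show_title'] = !empty($new_instance['show_title']) ? 1 : 0;

    if (current_user_can('unfiltered_html')) {
        // Trusted users keep their message exactly as entered.
        $instance['message'] = $new_instance['message'];
    } else {
        // Everyone else: slash it, filter it, then take the slashes back out.
        $instance['message'] = stripslashes(
            wp_filter_post_kses(addslashes($new_instance['message']))
        );
    }

    return $instance;
}
```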
What We Have Sanitized
We have sanitized the title by escaping its attributes, we’ve whitelisted the checkbox, we’ve used WordPress’ esc_textarea to clean the message, and when we go to update the database, we have intensified that by cleaning everything up much more.
This prevents malicious script from running in the HTML file and it prevents malicious script from being stored in your database. That’s why we do it in both places. That’s data sanitization. Now if we save this and upload it, come back over here and refresh this, then check “Display the widget title” and hit save, you can see it actually saved that “Display the widget title”.
It no longer lost that checkbox and if we refresh this, it respects what’s been checked, we uncheck and save it, refresh and it’s gone. Let’s fix that title here. Remember that title is up in this form and the label is label for this variable. Actually, what we’ll say is, “Enter some text”, save, upload, double check and refresh. Now it says, “Enter some text” so we’ve got our label set up. We’ve got these things working correctly and this Example_Text_Widget is working properly.