Let’s Talk About Security – Validate All Input

In my last post, Sebastian Safe built a login screen for his locksmith website. In this context, I addressed known security vulnerabilities and possible precautions that can be taken. Only users who are given the respective rights upon registration on the website should be granted access to the system. 

Now, Sebastian is going to make sure that the users can use the required features of the website. Users can upload images, files or texts to the server. The uploaded files are then processed by the system and stored in the corresponding server paths. Since he believes that only “selected” users have access to the system, Sebastian fails to take security precautions after the login. However, one has to assume that some of these users unknowingly have malware on their machines and unintentionally upload it to the system. Therefore, every user has to be treated as a potential hazard, and the data traffic on Sebastian’s website has to be verified and validated. This is where “Validate all input” comes in.

Figure 1: Validate all input; background photo created by creativeart – www.freepik.es

The objective of validation is to verify that the data entering the system neither cause damage nor cause information to be leaked. Data from an external system should be validated as early as possible so that they have as little opportunity as possible to manipulate Sebastian’s website. Even if the external systems are trusted, the incoming information has to be validated. This also includes partners whose data are required for processing, because there is no such thing as 100% certainty that partner systems have not been compromised. The data should first be validated on the syntactic and the semantic level: the file should have the logical structure of the respective file format, and the content should correspond to the required input. The following points should be observed in the validation.

1. Black- and whitelisting

This is the simplest method that Sebastian can implement. A blacklist rejects input that is known to be dangerous, whereas a whitelist accepts only input that is known to be allowed. For an image upload, it is important to verify that only image formats can actually be uploaded. It should not be possible to use an image upload field to upload scripts to the system that could subsequently attack it from within. However, it might still be possible to upload files that outwardly have the correct format but contain <SCRIPT> tags. Such files can execute scripts even though the file format has been verified. This kind of attack is called cross-site scripting (XSS), a type of injection in which the attacker’s script runs in the browser of whoever opens the file, which can give the attacker access to the system. Therefore, it is important not only to check the outward format, but to verify the content of the respective file as well.
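As a rough illustration, such a check might look like the following Python sketch. It combines an extension whitelist with a look at the first bytes of the file and a crude scan for script tags. The helper name `is_allowed_image` and the set of accepted formats are assumptions made for this example, not part of Sebastian’s website.

```python
from pathlib import PurePath

# Whitelist (assumed for this example): accepted extensions and the byte
# sequences a file of that format has to start with.
ALLOWED_SIGNATURES = {
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".gif": (b"GIF87a", b"GIF89a"),
}


def is_allowed_image(filename: str, content: bytes) -> bool:
    """Return True only if extension, file signature and content look harmless."""
    extension = PurePath(filename).suffix.lower()
    signatures = ALLOWED_SIGNATURES.get(extension)
    if signatures is None:
        return False  # extension is not on the whitelist
    if not content.startswith(signatures):
        return False  # content does not start like the claimed format
    # Crude content check: reject files that carry an embedded script tag.
    return b"<script" not in content.lower()
```

The whitelist decides what is let in at all; the signature and content checks catch files that merely pretend to be images. The script scan is deliberately simple and only meant to show the idea of looking inside the file.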

2. Limits (min & max)

The value range for entries and files should also be defined. This does not only refer to uploading a file: it is also possible to send strings that are subsequently stored in the database. Consequently, if a date or number is entered, it is important to check, for example, that it has the correct length and lies within the expected range. When uploading files, it is advisable to check that the size of the uploaded file is not in the gigabyte range if it is a simple profile picture. All these considerations regarding minimum and maximum limits are common test scenarios that any professional QA team will observe. A special example of why such size limits matter is the “billion laughs attack”. This is an XML file that defines entities in its header. Starting from a single LOL string, each entity references the previous one ten times, so the number of strings multiplies by ten with every level of nesting.

    <?xml version="1.0"?>
    <!DOCTYPE lolz [
     <!ENTITY lol "lol">
     <!ELEMENT lolz (#PCDATA)>
     <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
     <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
     <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
     <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
     <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
     <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
     <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
     <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
     <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
    ]>
    <lolz>&lol9;</lolz>

In this example, the LOL string is loaded into the system memory 1,000,000,000 times when the file is parsed. Depending on the strength of the system’s hardware, reading this file can result in a complete collapse. The quantity and size of the string can even be increased, and several files of this kind could be uploaded simultaneously. In such a situation, it is therefore necessary for the system to terminate the process when a certain size is exceeded in order to protect itself. This is not a security vulnerability that leaks information, but it can be exploited to crash the system, and such a crash can then be used as a stepping stone for further attempts to infiltrate the system.
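To make the numbers tangible, here is a small Python sketch. It first computes how large the expanded document becomes and then shows one possible way to refuse such files before any entity is expanded, using the `StartDoctypeDeclHandler` hook of the standard library’s expat parser. The names `expanded_size`, `check_xml_upload`, `RejectedInput` and the limit `MAX_UPLOAD_BYTES` are assumptions chosen for this example.

```python
from xml.parsers import expat

MAX_UPLOAD_BYTES = 2 * 1024 * 1024  # assumed upper limit for an uploaded XML file


def expanded_size(levels: int, fanout: int, base: str) -> int:
    """Bytes needed once every nested entity has been expanded."""
    return len(base.encode("utf-8")) * fanout ** levels


class RejectedInput(ValueError):
    """Raised when input exceeds a limit, declares a DTD or is malformed."""


def check_xml_upload(data: bytes) -> None:
    """Raise RejectedInput for oversized files and for any file with a DOCTYPE."""
    if len(data) > MAX_UPLOAD_BYTES:
        raise RejectedInput("file too large")

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Called as soon as the parser sees <!DOCTYPE ...>, before expansion.
        raise RejectedInput("DTD declarations are not accepted")

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = reject_doctype
    try:
        parser.Parse(data, True)
    except expat.ExpatError as error:
        raise RejectedInput(f"not well-formed XML: {error}") from error
```

With nine levels and ten references per level, `expanded_size(9, 10, "lol")` comes to 3,000,000,000 bytes. The uploaded file itself is smaller than a kilobyte, so a limit on the upload size alone does not catch it; the limit has to apply to what the parser is allowed to do with the file.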

For the testers among you, the billion laughs attack might be an interesting opportunity to test your test systems.
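The minimum and maximum limits for plain form fields described in this section could be sketched in Python as follows. The fields `quantity` and `appointment` and their limits are hypothetical examples; the point is that the length is checked first, then the format, then the value range.

```python
from datetime import date


def parse_quantity(raw: str, minimum: int = 1, maximum: int = 100) -> int:
    """Parse a numeric field and enforce length and value limits."""
    if not 1 <= len(raw) <= 3:
        raise ValueError("quantity must have 1 to 3 characters")
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError("quantity must consist of ASCII digits only")
    value = int(raw)
    if not minimum <= value <= maximum:
        raise ValueError(f"quantity must be between {minimum} and {maximum}")
    return value


def parse_appointment(raw: str, earliest: date, latest: date) -> date:
    """Parse a date field (YYYY-MM-DD) and enforce the allowed date range."""
    if len(raw) != 10:
        raise ValueError("date must have exactly 10 characters")
    day = date.fromisoformat(raw)  # raises ValueError for impossible dates
    if not earliest <= day <= latest:
        raise ValueError("date is outside the allowed range")
    return day
```

Both functions raise a `ValueError` instead of silently correcting the input, so the caller can reject the request and log the failure.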

3. Client and server verification

It is important to make sure that the input is validated not only on the client, but on the server as well. In web applications, it is possible to bypass the client-side JavaScript checks by means of a proxy or direct requests to the server. Therefore, a double-sided safeguard is recommended. If the client verifies that the file is a JPG, and the file is uploaded only after this verification, it may be tempting to skip the verification on the server side. However, if the attacker reads out the exact addresses and the structure of this upload request by means of network monitoring tools, they will be able to create their own upload request with REST tools and bypass the client-side verification. This way, an attacker would be able to deposit scripts directly on the server, which they can then use to gain access to the server or read out information. Therefore, files have to be verified both before and after the upload.
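The following Python sketch illustrates the server side of this double safeguard. It assumes a hypothetical `UploadRequest` structure; the field `client_validated` stands for anything the client claims about its own checks. The server repeats the checks on the actual bytes and ignores those claims.

```python
from dataclasses import dataclass

JPEG_SIGNATURE = b"\xff\xd8\xff"
MAX_BYTES = 5 * 1024 * 1024  # assumed server-side size limit


@dataclass
class UploadRequest:
    """What arrives at the server; every field can be forged by an attacker."""
    filename: str
    declared_content_type: str
    client_validated: bool
    body: bytes


def server_accepts(request: UploadRequest) -> bool:
    """Repeat the checks on the server instead of trusting the client."""
    # Deliberately ignored: request.client_validated and
    # request.declared_content_type, because a hand-built request can set both.
    if not request.filename.lower().endswith((".jpg", ".jpeg")):
        return False
    if len(request.body) > MAX_BYTES:
        return False
    return request.body.startswith(JPEG_SIGNATURE)
```

A request built with a REST tool can claim to be a validated JPG, but it is still rejected if the body does not look like one.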

4. Server-side control

Another point to be considered is the storage location of the uploaded file. It should be determined by the server, not the client. Depending on how much information an attacker has been able to read from the client’s scripts, such information can give them an overview of the server structure and thus a larger target for the attack. Furthermore, the server should also rename a file when storing it in the defined storage path. This ensures that any script content in the file that was overlooked and that refers to the file’s own name cannot be executed, because a file with that name no longer exists.

    server\test\uploads\Testupload.JPG

    server\test\uploads\TE123ST321UPZEZELOAUIUID.JPG

These points can easily be verified by QA by checking the server directories after a test upload and analyzing the upload files located there.
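A minimal Python sketch of this server-side control, assuming a hypothetical upload directory `UPLOAD_ROOT` and a helper named `storage_path_for`: the server takes nothing from the client’s file name except a whitelisted extension and generates the rest itself.

```python
import uuid
from pathlib import Path, PurePath

UPLOAD_ROOT = Path("/srv/test/uploads")  # hypothetical; set in the server configuration
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def storage_path_for(client_filename: str) -> Path:
    """Build the storage path on the server; only the extension survives."""
    extension = PurePath(client_filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension {extension!r} is not allowed")
    # A random name means the client can neither choose nor predict the path.
    return UPLOAD_ROOT / f"{uuid.uuid4().hex}{extension}"
```

Because directory parts of the client’s file name are discarded as well, a name such as `../../etc/Testupload.JPG` still ends up inside the upload directory under a new random name.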

In addition, there is software that scans the servers for malware and examines the uploaded files directly. For testing purposes, malware can be uploaded to the server to identify any such security vulnerabilities. However, such tests should always be agreed in advance with the person responsible for the system.

5. Regular expressions

Regular expressions tailored to a specific task are a great tool for eliminating security vulnerabilities. The only characters allowed for the required input are those the system needs to process. It is not necessary to allow the entire Unicode character set if only digits are needed for a post code. This way, you can limit the potential risks for each input field. Again, the input should be verified both on the client and the server side. Another important security guideline for regular expressions is NOT to use wildcards.

Simple expression for an email address:

    ^[a-zA-Z]+@[a-zA-Z]+\.[a-zA-Z]+$

Here, a simple regular expression is used for email addresses, although it can be elaborated to a much greater degree.

Elaborate expression for an email address:

    ^[\w.{}#+&-]{1,}@([\da-zA-Z{}#-]{1,}\.){1,}[\da-zA-Z-]{2,3}$

As email addresses offer a very wide range to choose from, writing a restricting regular expression for them is difficult due to the large number of special characters alone. The good news is, there are numerous frameworks that already have such features and that will help you with the verification in this context.
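For fields that are easier to pin down than email addresses, a restrictive expression is quickly written. The following Python sketch assumes a five-digit post code and uses a variant of the simple email expression with quantifiers and an escaped dot; the helper `is_valid` and the length limits are assumptions for illustration.

```python
import re

# Assumed format: a post code of exactly five ASCII digits.
POST_CODE = re.compile(r"[0-9]{5}")
# Simple email expression with explicit character sets and no wildcards.
SIMPLE_EMAIL = re.compile(r"[a-zA-Z]+@[a-zA-Z]+\.[a-zA-Z]+")


def is_valid(pattern: re.Pattern, value: str, max_length: int) -> bool:
    """Accept the value only if the whole string matches the pattern."""
    if len(value) > max_length:
        return False
    return pattern.fullmatch(value) is not None
```

`fullmatch` is used so that the entire input has to match; a valid prefix followed by extra characters, for example a trailing line break or a script tag, is rejected.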

These are just a few of the issues that ought to be observed to eliminate security vulnerabilities. What is also worth mentioning is that using known frameworks is usually better than devising your own features. On the one hand, these frameworks have been tried and tested and have evolved over time, and on the other hand, they are updated whenever new vulnerabilities are identified.

Known frameworks for input validation:

  • Django Validators
  • FluentValidation
  • Apache Commons Validators
  • Express Validator

The “FluentValidation” framework makes validating strings, regular expressions, etc. much easier for the developer. The framework is structured in such a way that you can define simple and clear validation rules for the properties of a specific class that the user fills in on the website interface. Sebastian created a small class for his locksmith customers, which he calls “Locksmith”. The registered users enter the required information for this class via a registration form. This information includes the company name, address, email address and credit card number and is used to create an instance of the class. But before this instance is used to store the information in the database, where unverified data could cause damage, the data are verified by the framework.

public class Locksmith 
{ 
    public int Id { get; set; }
    public string firmname { get; set; } 
    public string address { get; set; }
    public string email { get; set; }
    public string creditcard { get; set; } 
} 



using FluentValidation;

public class CustomerService : AbstractValidator<Locksmith>
{
    public CustomerService()
    {
        RuleFor(locksmith => locksmith.firmname)
            .NotEmpty()       // rejects null, empty and whitespace-only values
            .Length(1, 100);  // string length between 1 and 100
        // The remaining length limits are example values, not taken from the original.
        RuleFor(locksmith => locksmith.address)
            .NotEmpty()
            .Length(1, 200);
        RuleFor(locksmith => locksmith.email)
            .NotEmpty()
            .Length(3, 254)
            .EmailAddress();  // verify email address format
        RuleFor(locksmith => locksmith.creditcard)
            .NotEmpty()
            .Length(13, 19)
            .CreditCard();    // verify credit card number
    }
}




Locksmith customer = new Locksmith();
CustomerService examiner = new CustomerService();
ValidationResult results = examiner.Validate(customer); // ValidationResult is in FluentValidation.Results
if (!results.IsValid)
{
    foreach (var failure in results.Errors)
    {
        Console.WriteLine("Property " + failure.PropertyName + " failed. Error: " + failure.ErrorMessage);
    }
}

If the validation produces any errors, they are displayed and documented. This way, the correctness of the information is ensured before the data are stored.

This concludes the second part of my security talk. I hope this blog post gave you an overview of the validation of input in your systems.
