Generally, web applications shouldn’t sanitize input. Validate the data and either accept or reject it, and give the user feedback on why it was rejected. However, there is sometimes a grey area where you wish to make the dataset more consistent (email addresses, postal codes, phone numbers) without negatively affecting the user. The input is OK but it needs some reformatting.

In Laravel this can be done in Eloquent model attribute set mutators after validation and before the database query is run. But sometimes that cleanup has to be done before validation.

Laravel 5.0 pre-release had a short-lived FormRequest@sanitize() method to clean up inputs before validation, but a fully-featured sanitizer never made it into an official release. Laravel 5.4 later added the web middleware TrimStrings and ConvertEmptyStringsToNull to normalize blank strings before they reach the database. They both extend a parent TransformsRequest class that walks through each input in the query parameters and the HTTP request body. Well, your userland code can re-use that class too!

  1. Add sanitize methods to Laravel Request classes
  2. What is wrong with $request->replace()?
  3. Use Case: unique users.email registration
  4. Use Case: Disallow multi-byte emoji characters
  5. Use Case: Coerce Y-m-d date format
  6. Use Case: Format Canadian postal codes

Add sanitize methods to Laravel Request classes

I have set up two ways to reformat request input before validation:

  1. request()->sanitize($key, $filter);
  2. Use FormRequest::sanitizes() to define a set of filters per input key, similar to how rules() works.

First set up the core middleware class that re-uses TransformsRequest. It has three built-in filters named 'lower', 'trim' (multi-byte whitespace, not just ASCII), and 'upper'.

app/Support/SanitizesRequest.php

namespace App\Support;

use Illuminate\Foundation\Http\Middleware\TransformsRequest;
use Illuminate\Support\Str;

class SanitizesRequest extends TransformsRequest
{
    protected $filters;
    protected $request;

    public function __construct(array $filters)
    {
        $this->filters = $filters;
    }

    public function handle($request, \Closure $next, ...$attributes)
    {
        $this->request = $request;

        return parent::handle(...func_get_args());
    }

    protected function transform($key, $value)
    {
        if (! is_string($value) || ! isset($this->filters[$key])) {
            return $value;
        }

        return array_reduce(
            $this->wrap($this->filters[$key]),
            function ($value, $filter) {
                return ($this->alias($filter) ?? $filter)(
                    $value, $this->request
                );
            },
            $value
        );
    }

    protected function wrap($filters)
    {
        if (! is_array($filters) || is_callable($filters)) {
            $filters = [$filters];
        }

        return $filters;
    }

    protected function alias($filter)
    {
        if (is_string($filter)) {
            return $this->aliases()[$filter] ?? null;
        }
    }

    protected function aliases()
    {
        static $aliases;

        if ($aliases !== null) {
            return $aliases;
        }

        return $aliases = [
            'lower' => [Str::class, 'lower'],
            'trim' => function ($value) {
                // Removes multi-byte whitespace too. e.g., from
                // MS Word copy and paste to <input>
                //
                // I prefer to Str::macro('trim', ...) this Closure.
                $value = preg_replace(
                    '/^[\pZ\pC]+|[\pZ\pC]+$/u', '', $value
                );

                // Only return non-blank results; blank input becomes null.
                if (filled($value)) {
                    return $value;
                }

                return null;
            },
            'upper' => [Str::class, 'upper'],
        ];
    }
}
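
The heart of transform() is a left fold: each filter receives the output of the one before it. Below is a minimal framework-free sketch of that idea. The apply_filters() helper is hypothetical (it is not part of Laravel or of the class above), and it uses strtolower()/strtoupper() where Str::lower()/Str::upper() would use the multi-byte versions.

```php
<?php

// Hypothetical stand-alone helper mirroring SanitizesRequest::transform().
function apply_filters(string $value, array $filters): ?string
{
    // Named aliases, like the 'lower', 'trim', and 'upper' built-ins.
    $aliases = [
        'lower' => 'strtolower', // Str::lower() uses mb_strtolower()
        'upper' => 'strtoupper', // Str::upper() uses mb_strtoupper()
        'trim'  => function ($value) {
            // \pZ = Unicode separators (incl. U+00A0), \pC = control chars.
            return preg_replace('/^[\pZ\pC]+|[\pZ\pC]+$/u', '', $value);
        },
    ];

    // Left fold: the carry is the value as filtered so far.
    return array_reduce($filters, function ($carry, $filter) use ($aliases) {
        $callable = is_string($filter) && isset($aliases[$filter])
            ? $aliases[$filter]
            : $filter;

        return $callable($carry);
    }, $value);
}

// A non-breaking space (U+00A0) pasted around an email address.
$pasted = "\u{00A0}DerekM@Example.com \u{00A0}";

// PHP's native trim() leaves the non-breaking spaces in place.
var_dump(trim($pasted) === 'DerekM@Example.com'); // bool(false)

// The Unicode-aware 'trim' filter removes them, then 'lower' runs.
var_dump(apply_filters($pasted, ['trim', 'lower'])); // string(18) "derekm@example.com"
```

Any callable can sit in the list next to the aliases, which is what lets the use cases further down pass invokable filter objects.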

Define request()->sanitize() in AppServiceProvider.php or whichever class you use to set up framework macros. It can sanitize a single parameter or many named keys.

app/Providers/AppServiceProvider.php

namespace App\Providers;

use App\Support\SanitizesRequest;
use Illuminate\Http\Request;
use Illuminate\Support\ServiceProvider;

class AppServiceProvider extends ServiceProvider
{
    public function boot()
    {
        Request::macro('sanitize', function ($key, $filter = null) {
            if (is_array($key)) {
                $filters = $key;
            } else {
                $filters = [$key => $filter];
            }

            (new SanitizesRequest($filters))->handle($this, function () {});

            return $this;
        });
    }

    // ...
}

Then allow FormRequest-extending classes to define their sanitizes() method. The all() method is reached early in the FormRequest lifecycle so the first call runs your input filters.

app/Http/Requests/FormRequest.php

namespace App\Http\Requests;

use Illuminate\Foundation\Http\FormRequest as BaseFormRequest;

abstract class FormRequest extends BaseFormRequest
{
    protected $isSanitized = false;

    public function all($keys = null)
    {
        if (!$this->isSanitized) {
            if (method_exists($this, 'sanitizes')) {
                $this->sanitize($this->sanitizes());
            }

            $this->isSanitized = true;
        }

        return parent::all($keys);
    }
}

Now you can define input filters on many parameters.

app/Http/Requests/StoreCouponRedemptionRequest.php

namespace App\Http\Requests;

class StoreCouponRedemptionRequest extends FormRequest
{
    public function authorize()
    {
        return true;
    }

    public function rules()
    {
        return [
            'email' => 'required|email:rfc,dns,spoof|unique:users,email',
            'coupon' => 'required|string|exists:coupon,code',
        ];
    }

    public function sanitizes()
    {
        return [
            'email' => 'lower',
            'coupon' => 'upper',
        ];
    }
}

What is wrong with $request->replace()?

This method only works on the ‘ParameterBag’ object returned by $request->getInputSource(). Query parameters, the HTTP body, and JSON parameters are each stored in a different ‘ParameterBag’ inside Laravel’s Illuminate\Http\Request class. Depending on how the request was built, your $request->replace() call may have no effect on the inputs eventually fed into the database. TransformsRequest works on every ‘ParameterBag’, so it avoids this problem.

In most cases, replace() is OK. Just don’t mix ?query=params into POST…
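
To make the difference concrete, here is a simplified model of a form POST that also carries a query string. Plain arrays stand in for the ParameterBag objects, and both helper functions are hypothetical; this is a sketch of the behaviour described above, not Laravel's actual implementation.

```php
<?php

// Plain arrays standing in for the separate ParameterBag objects.
$bags = [
    'query' => ['coupon' => 'save10'],            // from ?coupon=save10
    'body'  => ['email' => 'DerekM@Example.com'], // from the POST body
];

// replace()-style: only the input source (the body of a form POST) is swapped.
function replace_input_source(array $bags, array $input): array
{
    $bags['body'] = $input;

    return $bags;
}

// TransformsRequest-style: the same filter visits every bag.
function transform_all_bags(array $bags, callable $filter): array
{
    foreach ($bags as $name => $bag) {
        $bags[$name] = array_map($filter, $bag);
    }

    return $bags;
}

$replaced = replace_input_source($bags, [
    'email'  => 'DEREKM@EXAMPLE.COM',
    'coupon' => 'SAVE10',
]);
// $replaced['query']['coupon'] is still 'save10', so any code that reads
// the query string directly sees the raw value.

$transformed = transform_all_bags($bags, 'strtoupper');
// $transformed['query']['coupon'] is 'SAVE10'; every bag was cleaned.
```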

Use Case: unique users.email registration

Laravel’s authentication scaffolding (RegisterController and the RegistersUsers trait) currently allows duplicate accounts for the same email address when not using MySQL. “DerekM@Example.com” and “derekm@example.com” both lead to the same inbox, but without a case-insensitive check during your app’s registration, on some databases like Postgres a user may end up with many accounts depending on their SHIFT or CAPSLOCK key on the day.

Adding two Str::lower() controller calls makes the code kinda gnarly.

app/Http/Controllers/Auth/RegisterController.php

namespace App\Http\Controllers\Auth;

use App\Http\Controllers\Controller;
use App\Providers\RouteServiceProvider;
use App\User;
use Illuminate\Foundation\Auth\RegistersUsers;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Hash;
use Illuminate\Support\Facades\Validator;
use Illuminate\Support\Str;

class RegisterController extends Controller
{
    use RegistersUsers;

    protected $redirectTo = RouteServiceProvider::HOME;

    public function __construct()
    {
        $this->middleware('guest');
    }

    protected function validator(array $data)
    {
        $data['email'] = Str::lower($data['email'] ?? null);

        return Validator::make($data, [
            'name' => ['required', 'string', 'max:255'],
            'email' => ['required', 'string', 'email', 'max:255', 'unique:users'],
            'password' => ['required', 'string', 'min:8', 'confirmed'],
        ]);
    }

    protected function create(array $data)
    {
        return User::create([
            'name' => $data['name'],
            'email' => Str::lower($data['email']),
            'password' => Hash::make($data['password']),
        ]);
    }

    public function register(Request $request)
    {
        $inputs = $request->sanitize('email', 'lower')->all();

        $this->validator($inputs)->validate();

        $user = $this->create($inputs);

        $this->guard()->login($user);

        return redirect($this->redirectPath());
    }
}

I prefer to rewrite RegisterController to use request()->validate() instead of spreading the logic across many methods. The register() action then becomes very simple.

namespace App\Http\Controllers\Auth;

use App\Http\Controllers\Controller;
use App\Providers\RouteServiceProvider;
use App\User;
use Illuminate\Foundation\Auth\RegistersUsers;
use Illuminate\Support\Facades\Hash;

class RegisterController extends Controller
{
    use RegistersUsers;

    protected $redirectTo = RouteServiceProvider::HOME;

    public function __construct()
    {
        $this->middleware('guest');
    }

    public function register()
    {
        $inputs = request()->sanitize('email', 'lower')->validate([
            'name' => ['required', 'string', 'max:255'],
            'email' => ['required', 'string', 'email', 'max:255', 'unique:users'],
            'password' => ['required', 'string', 'min:8', 'confirmed'],
        ]);

        $user = User::create([
            'name' => $inputs['name'],
            'email' => $inputs['email'], // this is lowercase
            'password' => Hash::make($inputs['password']),
        ]);

        $this->guard()->login($user);

        return redirect($this->redirectPath());
    }
}

Now entering “DerekM@Example.com” will insert “derekm@example.com” into the users.email database column. Likewise, the LoginController should be changed to lowercase the ’email’ parameter.
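
The duplicate-account problem is easy to reproduce without a database. In this sketch a plain array plays the role of a users.email column with a case-sensitive unique check; the register_email() function is hypothetical and only exists to show the effect of lowercasing before the check.

```php
<?php

// Hypothetical stand-in for a case-sensitive unique check on users.email.
function register_email(array &$emails, string $email, bool $normalize): bool
{
    if ($normalize) {
        // The 'lower' filter; ASCII-only here, Str::lower() is multi-byte safe.
        $email = strtolower($email);
    }

    if (in_array($email, $emails, true)) {
        return false; // rejected as a duplicate
    }

    $emails[] = $email;

    return true;
}

// Without the filter, both spellings are accepted as different accounts.
$raw = [];
register_email($raw, 'DerekM@Example.com', false); // true
register_email($raw, 'derekm@example.com', false); // true, a second account

// With the filter, the second attempt collides with the first.
$clean = [];
register_email($clean, 'DerekM@Example.com', true); // true
register_email($clean, 'derekm@example.com', true); // false
```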

Use Case: Disallow multi-byte emoji characters

Depending on how professional you want a product to look (or if you don’t want to irritate certain demographics of your userbase), silently stripping emoji characters from text input can be a better approach than giving validation feedback that only prompts users to find workarounds by trial and error.

emoji-detector-php has a general regexp string to match common multi-byte characters used for emoji displays. Depending on your location in the world, what is interpreted as an emoji in the browser can widely vary.

app/Support/InputFilters/EmojiFilter.php

namespace App\Support\InputFilters;

class EmojiFilter
{
    public function __invoke($value)
    {
        return preg_replace('/' . $this->patterns() . '/u', '', $value);
    }

    protected function patterns()
    {
        static $patterns;

        if ($patterns === null) {
            // https://github.com/aaronpk/emoji-detector-php/blob/master/src/regexp.json
            // Copy to project repo database/json/emoji-patterns.json.
            $patterns = json_decode(
                file_get_contents(database_path('json/emoji-patterns.json')), true
            );
        }

        return $patterns;
    }
}

$inputs = request()
    ->sanitize('body', new EmojiFilter)
    ->validate(['body' => 'required|string|min:10']);

If the user is a terrible person that only enters emojis, the above call will fail validation because ‘body’ becomes a blank string.
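
Here is a small stand-alone sketch of the same idea. The pattern below is a deliberately tiny stand-in covering a few common emoji blocks; it is an assumption for illustration and not the emoji-detector-php regexp, which matches far more.

```php
<?php

// Tiny illustrative pattern (NOT the emoji-detector-php regexp): pictograph
// blocks, misc symbols and dingbats, variation selector, zero-width joiner.
const EMOJI_PATTERN = '/[\x{1F300}-\x{1FAFF}\x{2600}-\x{27BF}\x{FE0F}\x{200D}]/u';

function strip_emoji(string $value): string
{
    // preg_replace() returns null on invalid UTF-8; fall back to the input.
    return preg_replace(EMOJI_PATTERN, '', $value) ?? $value;
}

// U+1F44D (thumbs up) is removed, the text around it is kept.
$comment = strip_emoji("Great job \u{1F44D}"); // 'Great job '

// An emoji-only body collapses to a blank string, so 'required' would fail.
$blank = strip_emoji("\u{1F600}\u{1F600}\u{1F600}"); // ''
```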

Use Case: Coerce Y-m-d date format

Date inputs still require a custom frontend component, which can become interesting in localized apps that may be displaying many different date formats. This sanitizer will transform input to an acceptable Y-m-d string for the database. Technically it’s validating date inputs twice, but it covers most bot-triggered edge cases.

Fun fact: Laravel’s 'date' validation rule passes for some input that Carbon::parse() / new DateTime fail to interpret.

app/Support/InputFilters/DateFormat.php

namespace App\Support\InputFilters;

use Illuminate\Support\Carbon;
use Illuminate\Support\Facades\Validator;

class DateFormat
{
    protected $format;

    public function __construct($format = 'Y-m-d')
    {
        $this->format = $format;
    }

    public function __invoke($value)
    {
        return optional($this->parse($value))->format($this->format);
    }

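    protected function parse($value)
    {
        // The 'date' rule and Carbon disagree on some inputs, so check both.
        if (Validator::make(['date' => $value], ['date' => 'required|date'])->passes()) {
            try {
                $timestamp = Carbon::parse($value);

                if ($timestamp->year >= 1900) {
                    return $timestamp;
                }
            } catch (\Throwable $e) {
                // Unparseable input falls through and returns null.
            }
        }

        return null;
    }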
}

$appointment = $request->user()->appointments()->create(
    $request->sanitize(
        array_fill_keys(['starts_at', 'ends_at'], new DateFormat)
    )->validate([
        'starts_at' => 'required|date|after:today',
        'ends_at' => 'required|date|after:starts_at',
    ])
);
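
For readers who want to try the coercion outside Laravel, here is a rough equivalent using PHP's built-in DateTimeImmutable in place of Carbon and the validator. The coerce_date() helper is hypothetical, and it keeps the same 1900 floor as the DateFormat filter.

```php
<?php

// Hypothetical helper: a Y-m-d string, or null when the input is unusable.
function coerce_date(string $value, string $format = 'Y-m-d'): ?string
{
    if (trim($value) === '') {
        return null; // an empty string would otherwise be parsed as "now"
    }

    try {
        $date = new DateTimeImmutable($value);
    } catch (\Throwable $e) {
        return null; // unparseable input
    }

    // Reject implausibly old years, like the DateFormat filter does.
    return (int) $date->format('Y') >= 1900 ? $date->format($format) : null;
}

coerce_date('March 5, 2021');    // '2021-03-05'
coerce_date('2021-03-05 14:30'); // '2021-03-05'
coerce_date('foo');              // null
coerce_date('1066-10-14');       // null (before 1900)
```

Returning null rather than the original string means the later 'required|date' rule rejects anything the filter could not understand.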

Use Case: Format Canadian postal codes

Allow users to enter the postal code with or without a space and in upper or lowercase letters. It’ll be auto-corrected to uppercase with a single space in the middle.

app/Support/InputFilters/CanadianPostalCodeFilter.php

namespace App\Support\InputFilters;

class CanadianPostalCodeFilter
{
    public function __invoke($value)
    {
        if (preg_match('/\b([A-Z][0-9][A-Z])\s?([0-9][A-Z][0-9])\b/i', $value, $matches)) {
            return trim(strtoupper("{$matches[1]} {$matches[2]}"));
        }

        return $value;
    }
}

$inputs = request()
    ->sanitize('postal', new CanadianPostalCodeFilter)
    ->validate([
        'country' => 'required|string',
        'postal' => 'required|string', // Maybe run postal validation too.
    ]);

“h3c5l2” will be saved to the database as “H3C 5L2”, and every other postal code gets the same treatment. Sloppy database results begone.
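
As a quick check outside the framework, the same regular expression can be exercised as a plain function, followed by the kind of strict shape check the “maybe run postal validation too” comment hints at. Both function names are hypothetical, and the shape check is an assumption: it only tests the letter-digit layout, not which letters are actually in use.

```php
<?php

// Same normalisation as CanadianPostalCodeFilter, as a plain function.
function format_postal_code(string $value): string
{
    if (preg_match('/\b([A-Z][0-9][A-Z])\s?([0-9][A-Z][0-9])\b/i', $value, $matches)) {
        return strtoupper("{$matches[1]} {$matches[2]}");
    }

    return $value; // leave anything unrecognised for validation to reject
}

// Assumed strict shape check to run after the filter.
function is_postal_code_shape(string $value): bool
{
    return preg_match('/^[A-Z][0-9][A-Z] [0-9][A-Z][0-9]$/', $value) === 1;
}

format_postal_code('h3c5l2');  // 'H3C 5L2'
format_postal_code('H3C 5l2'); // 'H3C 5L2'
format_postal_code('hello');   // 'hello' (unchanged)

is_postal_code_shape(format_postal_code('h3c5l2')); // true
is_postal_code_shape(format_postal_code('hello'));  // false
```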