facebook youtube pinterest twitter reddit whatsapp instagram

A Faster Router System in PHP - Part 2 (Improvement & Benchmarks)

This guide would build on part one of the Faster Router in PHP.

Part one: https://tonics.app/posts/ff9af70984746b91/faster-router-php

Note: You can jump to the benchmark section if all you want to see are the results, however, It would be a good idea to read at least the Potential Performance Improvement section to get an idea of why TonicsRouter might run super fast in some scenarios.

What's Up In Part 2?

I'll provide answers to common questions I got in the first part, share with you some terrific improvements I made to the tree algorithm when matching a route which can be about 3 thousand times faster or more depending on the route we are matching, and then we benchmark the TonicsRouter with a couple of major PHP routers.

I would like to point out that, I am not in the business of promoting my libraries, please use whichever one you want, if you learn one or two things from my libraries, fine.

I am only sharing

  • In the hope someone would find it interesting,
  • As a road-to-release guide to TonicsCMS (my CMS) and how it works,
  • So I can come back for a review (Part 1 made the Part 2 improvement possible)
  • and most importantly so I can possibly get a job or freelance opportunity, someone might find it worthy, you know.

Let's move on to...

Answer To...

My answers to...

Why Not Check Other Router Libraries Before Creating a New One?

This is the reason I created this router, I checked.

I just do not need most of those features, for example, most Router provides a way to match a custom regex, I don't need that, though with the tree implementation, I can easily add that feature if I want, in short, you can do tons of complex customization by introspecting a certain node, e.g, by matching Date and Time, SSN, etc.

I don't need that, I am only interested in matching a static and a dynamic route, I'll do any validation in the Controller if there is a need to.

Do We Really Care About 1sec Difference Over 10k Requests?

To be honest, it doesn't matter in most cases.

However, there is obvious optimization you can't overlook, like the one I described in my Potential Performance Improvement section, aside from the fact that it is faster, it is also useful outside of routers in general and can be applied to other related issues, I learned from it.

If there is a difference of 1sec every 10k request, then there is a difference of 167 minutes (2.78 hours) per 10 million requests, it goes from there, I know that is just an example.

I won't waste my time debating which to use between for loop and foreach, but if I can discover a solution that would make me avoid unnecessary traversing, I'll do it in a heartbeat.

As an aside, I don't optimize upfront, I follow this pattern:

  1. I can either write a complete 
  2. Or use an existing solution,
  3. I use it for a while (e.g, the TonicsRouter was written circa 2021, and I did use a couple of other existing ones before then)
  4. If I detect potential performance improvement, apply improvement, and run benchmarks. Go to 3.

Is This Faster Than Symfony and FastRoute?

The idea was since Laravel is using the Symfony router under the hood, the benchmark should cover both, however, it isn't fair since Laravel has to call a couple more functions before getting to the Symfony matcher.

I have dropped Laravel off the benchmark list. Since it is using the Symfony Router, let's go to the source directly, cool?

What about FastRoute?

I haven't heard about FastRoute. I have added it to the list since a lot of people made mention of it.

Now, to the question, is it Faster than Symfony or FastRoute?

Using Symfony directly turns out to be fast, is FastRoute or SymfonyRouting faster than TonicsRouter, you would see the result in a moment ;).

You can also try to reproduce this and come to your own conclusion.

Potential Performance Improvement

I did a review of the first part, and I discovered that I can optimize this further, it won't make a difference most of the time but when it does, it's huge.

Let's see further optimization opportunities...

Node Teleporting

This is the most interesting part of my recent performance discovery, and this was possible because of the benchmark I did in Part 1, the "How Fast Can You Match Deep Dynamic Levels" where I repeated the placeholder 1k times.

Introduction

The problem with matching routes in the tree is that the longer the path, the higher the time it would take to match a route.

Take a look at the below case:

$dynamic = str_repeat("/:id", 1000);
$route->get("/post$dynamic", function (){});

If you have a route that long, oh boy, it would have to travel that long to find a match, though you would not notice much difference as it is already fast enough, can we make that super-fast?

Let me introduce you to Node Teleportation (Sorry, I don't know a better name to give this)

According To Wikipedia:

Teleportation is the hypothetical transfer of matter or energy from one point to another without traversing the physical space between them

Without doing what again? Traversing the physical space between them!

and that is exactly what we would be doing, I believe you can already picture how fast this would be.

While matching a route, we can skip traversing nodes by teleporting to the desired node, best case, you only need to teleport ones, and worse case, you might do more than one teleporting with a bit of traversing, either way, it is faster than just traversing.

But!

It is not that simple, let's consider the previous example and let me quickly use my pen and paper for a quick sketch:

$dynamic = str_repeat("/:id", 1000);
$route->get("/post$dynamic", function (){});

Suppose you want to match: /posts/hi/hi/.../hi

A tree structure with nodes pointing to the next node

Without teleporting, you would have to step on each box in the above image.

With teleporting, you can skip stepping on the box, and just skip to the last box like the below image:

A tree structure with nodes pointing to the next node and an arrow showing teleportation from the root to the last node

Easy enough, but what if we have the following routes defined:

/post/:id/:id/:id
/post/:id/cat/3

The tree would look something like so:

A tree with nodes and its siblings

If we want to match the route: /post/:id/cat/3 where :id can be dynamic, how do we do that?

Teleporting from the root [/] to [3] is not hard but also not straightforward, the root or the current parent node wouldn't know ahead of time where to teleport since there are several nodes it can reach, it is ambiguous.

You might end up teleporting into an ocean.

One way to solve this problem is by keeping references to possible destinations you can teleport. (We would use this approach for a new problem entirely)

Then based on the last path of the route you are trying to match, in our case [3], you can teleport.

But!

There are too many problems with the above approach, you would have to keep track of not only possible destinations but also siblings' information, what if a node has more than one sibling, a lot of what if might crop up, including the one's you didn't account for.

Here are the rules, I devised for node teleporting and they should be followed in order:

Note: Teleported Node means the node you are teleporting to.

Teleported Node Must Not Have Siblings

Refer to the above image.

When teleporting to a node in the tree, ensure that the destination node does not have any siblings. In other words, it should be the only child node of its parent.

This rule ensures that there is no ambiguity in the path traversal and that the teleported node is uniquely identified.

So, in our case, we first teleport to [:id], the one under [post], then we figure out the next node to traverse which in our case is [cat], then finally, we teleport to [3].

Only Teleport To The Nearest Descendant

This depends on how your routes are defined.

 The term nearest descendant refers to a child or any subsequent child node that is closer in hierarchy or level within the tree.

Let me give you an example, say we have the following routes:

route->get( '/:country', function (){});
route->get( '/:country/:state/:suburb', function (){});
route->get( '/:country/:state/:suburb/:street', function (){});

and you are trying to match /Nigeria/Lagos/Ikosi/Jimoh route, since all the descendant nodes, are possible destinations we can teleport, we can't know ahead of time where exactly we want to teleport.

To make the algorithm easy to implement, we can store the possible destinations, and based on the route, we teleport to the nearest descendant.

For the root [/], the possible destination can either be :country or :suburb or :street

nodes: array:1 [β–Ό
          -routeName: "/"
          -teleportNode: array:3 [β–Ό
            2 => /:country
4 => /:country/:state/:suburb             5 => /:country/:state/:suburb/:street ]
]

We teleport to /:country(2) :suburb(4) and then to :street(5).

Only add possible destinations for routes that are defined, there is no point adding it for say:  /:country/:state/ routes, we are never matching that path, so no point in adding it as a possible destination.

I hope that is clear.

Note: If you want to further speed things, you might even teleport directly to /:country/:state/:suburb/:street without having to bounce around.


Since the total count of the path we are trying to match is 5 ([/], [Nigeria], [Lagos], [Ikosi], [Jimoh]), check if the match exist in teleportNode property, if it does, teleport. If you are tryint to match  ([/], [Nigeria], [Lagos], [Ikosi]]), the count is 4, check if it exist and teleport.


If you can't match the count directly, check for the shortest path.


The reason this is easy is because we have a simple route path, if it happens that we have a sibling node in the middle of one of the descendant. As long as you are following the first rule, you should be fine, it would inform you that particular path isn't safe no more for teleporting.


All of this info should have been computed while building the tree.


I hope you understand.

Node Should Be Added By Using The Route Name as Its Key

This is obvious, I not only forgot to point this out in Part One, but I also didn't do so in my router code, it became obvious while reading Part One.

While inserting a child node into the parent, make it associative, instead of pushing the node to the list of child nodes, use the child node route name as the key and the node itself as the value, instead of doing:

$childNodes[] = $node;

Do the following instead:

$childNodes[$node->getName()] = $node;

Don't worry, this is safe for the tree structure (though, you might need to do some cleaning when adding a dynamic node).

The benefit of this approach is when searching for a specific child in a parent node, you can easily check if the child node exists with a single function call (isset, or array_key_exist), which totally eliminates the need for traversing the childNodes for a match.

If you are wondering how we then search for a dynamic route name, you are on the right track.

For every parent, there can only be one dynamic child node, by dynamic, I mean, the route name is not static.

What you should do is keep a reference in the parent property whenever a dynamic node has been added or updated.

Here is what I mean:

/p/:id/c

is the same as

/p/:slug/c

The dynamism is on the same path, so, it should not matter if you handle the tree building correctly.

So, keep a reference to the updated dynamic child node in the parent property, the parent in this case is [p], and the latest dynamic child node is [:slug].

However, the path:

/p/q/c

differs from

/p/s/c

since the parent [P] node can have several child nodes as long as they are static. 

I hope you get that.

Let's do some benchmarking...

Benchmarks

Here is a link to TonicsRouter: https://github.com/tonics-apps/tonics-router-system, by the time you are reading this, I would have pushed the changes.

Here are the functions for the benchmark setup:

function calculateAverageExecutionTime(callable $benchmark): string
{
    $totalExecutionTime = 0;
    $iterations = 1000;
    for ($i = 0; $i < $iterations; $i++) {
        $totalExecutionTime += $benchmark();
    }
    return $totalExecutionTime / $iterations;
}

function printBenchmarks(array $benchmarks): void
{
    $time = array_column($benchmarks, 'Time');
    array_multisort($time, SORT_ASC, $benchmarks);
    foreach ($benchmarks as $benchmark){
        echo $benchmark['Name'] . ' ' . sprintf("%f", $benchmark['Time']) . "<br>";
    }
}
  • calculateAverageExecutionTime function would be used to calculate the average execution time, we iterate 1k times and calculate the average.
  • printBenchmarks would print the benchmark result, the fastest would be at the top, and the slowest at the bottom

First Route in a Routes of 1000

I would generate 1000 routes with a few dynamic placeholders, then we can benchmark that.

If you wondering how I got an idea about the below route definition, then I saw it in this medium post:

https://nicolas-grekas.medium.com/making-symfony-router-lightning-fast-2-2-19281dcd245b

This section:

...matching the /abc399/399 URL 10k times takes just 10ms.


That’s 1 million match/s.! That’s also 15x faster than FastRoute, and 77.7x faster than our best starting mark before any optimizations.

So, I decided to generate more, something like: /post1...post1000, then we benchmark how far it can travel up or down.

Setup
        #------------------
    # TonicsRouter
#---------------------

for ($i = 1; $i <= 1000; $i++) {
    $tonicsRoute->get("/post$i/:tell/:me/:your/:name", function (){});
}
$treeGenerator = $tonicsRoute->getRouteTreeGenerator();

        #------------------
    # FastRouter
#---------------------
$routes = function (\FastRoute\RouteCollector $routes) {
    for ($i = 1; $i <= 1000; $i++) {
        $routes->addRoute('GET', "/post$i/{tell}/{me}/{your}/{name}", 'handler');
    }
};
$dispatcher = FastRoute\cachedDispatcher($routes, [
      'cacheKey' => "/path/to/FastRoute.cache",
      'cacheFile' => "/path/to/FastRoute.cache",
        'cacheDisabled' => false,
    ]
);

        #------------------
    # SymfonyRouting
#---------------------
$symfonyRoutes = new RouteCollection();
for ($i = 1; $i <= 1000; $i++) {
    $symfonyRoutes->add("post$i", new SymfonyRoute("/post$i/{tell}/{me}/{your}/{name}"));
}
$dumper = new CompiledUrlMatcherDumper($symfonyRoutes);
$compiled = $dumper->getCompiledRoutes();
Run Benchmark
printBenchmarks(
    [
        [
            'Name' => 'TonicsRouter',
            'Time' => calculateAverageExecutionTime(function () use ($treeGenerator){
                $startTonics = microtime(true);
                $treeGenerator->match("/post1/my/name/is/tonics");
                return microtime(true) - $startTonics;
            })
        ],

        [
            'Name' => 'FastRoute',
            'Time' => calculateAverageExecutionTime(function () use ($dispatcher){
                $start = microtime(true);
                $dispatcher->dispatch('GET', '/post1/my/name/is/tonics');
                return microtime(true) - $start;
            })
        ],

        [
            'Name' => 'Symfony Router',
            'Time' => calculateAverageExecutionTime(function () use ($compiled){
                $request = new Request();
                $request = $request->create("/post1/my/name/is/tonics");
                $requestContext = new RequestContext();
                $start = microtime(true);
                $matcher = new CompiledUrlMatcher(
                    $compiled, $requestContext->fromRequest($request)
                );

                $matcher->matchRequest($request);
                return microtime(true) - $start;
            })
        ]

    ],

);
Result

Sometimes TonicsRouter wins, and other time FastRoute take the lead position as you can see in the video proof, but I would give it to FastRoute because most of the time it leads.

  1. FastRoute 0.000001sec πŸ† (Winner) - 2 times faster than TonicsRouter and 20 times faster than SymfonyRouting
  2. TonicsRouter 0.000002sec - 10 times faster than SymfonyRouting
  3. SymfonyRouting 0.000020sec
Proof

Here is a video showing that I am using both the cached version of FastRoute and SymfonyRouting:

In The Middle of 1000 Routes

Let's find the route /post500/my/name/is/tonics

Setup

Use the same setup as the one above

Run Benchmark

Change every post1  to post500, e.g: /post1/my/name/is/tonics to /post500/my/name/is/tonics

Result

TonicsRouter won, sometimes, SymfonyRouting is above FastRoute, other times vice versa, and sometimes equal execution time, its a tie.

  1. TonicsRouter 0.000002sec πŸ† (Winner) - 9 times faster than the both
  2. SymfonyRouting  0.000018sec
  3. FastRoute 0.000018sec
Proof

Last Route in a Routes of 50,000

Let's generate 50k routes and match the last one, e.g /post50000/my/name/is/tonics

Setup

Same as the first, the only thing you should change is 1000 to 50000, also ensure to regenerate the cache in the case of FastRoute, clear it.

Run Benchmark

Change every post500 from the previous benchmark to post50000, e.g: /post500/my/name/is/tonics to /post50000/my/name/is/tonics

Result

What? The execution time of TonicsRouter never changes, if you are wondering why, refer to the performance improvement section.

In this benchmark scenario, even if you have 5 million routes, TonicsRouter would always have the same execution time, well, as for the others, they begin to suffer the more routes you have.

  1. TonicsRouter 0.000002 πŸ† (Winner) - 407.5 times faster than SymfonyRouting, and 4405.5 than FastRoute
  2. SymfonyRouting 0.000815 - 10.8 times faster than FastRoute
  3. FastRoute 0.008811
Proof

Game With BitBucket Routes πŸ†

I have prepared a public gist for this here: https://gist.github.com/devsrealm/d123b034db85b5856c7d1569ba983786

Ensure you change the autoload.php to the appropriate one.

We are gonna be randomizing a list of routes from BitBucket (contains about 177), and we print the router that can match faster with the execution time:

Conclusion

Which Router is Faster?

I leave that as an assignment to the reader.

Hire Me

If you like what I do, I am available for hire, my email address is on my Hire Me Page: https://tonics.app/posts/f7b4f201bc4a91bb/hire-me, or just simply mail me at olayemi@tonics.app

Thank you.