Article:   The PHP Scalability Myth
Subject:   Fundamental Platform Differences and why PHP scales better
Date:   2003-10-16 06:35:47
From:   anonymous2
One thing the article failed to make clear is the fundamental difference between PHP, running as an Apache module, and a Java application running on an application server.


Put simply, PHP is stateless - unless you store state externally (such as on the server side of a PHP session cookie), it's lost between requests. Very few PHP applications do more than store user session data between requests.


The longer explanation: when you make a request to a PHP script, Apache passes the request to the PHP engine, which first parses the PHP script (a stage that can be reduced with script caching such as Zend Accelerator), then executes it, rebuilding any data, objects etc. for that request.


The concern that this inherent statelessness is bad news comes from a fundamental lack of understanding of how to take advantage of it.


For example, if you choose to implement a Front Controller in PHP in a Struts-like fashion, parsing a giant XML file _ON EVERY REQUEST_, you've got it wrong. A large part of designing PHP applications is to focus on Page Controllers which restrict themselves to creating only the objects / data they need, _not_ loading every conceivable class / object for all your Page Controllers. Arguably, Apache is PHP's Front Controller.
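
A rough sketch of that "only build what this page needs" idea, assuming PHP 7.4 or later. The page names, the `$pages` registry and the `handleRequest` helper are invented for illustration; in a real application each factory would more likely be a separate file pulled in with `require_once`.

```php
<?php
// Hypothetical registry: page name => factory that builds only that page's dependencies.
// $loaded exists purely to make visible which dependencies were actually built.
$loaded = [];

$pages = [
    'article' => function () use (&$loaded) {
        $loaded[] = 'ArticleStore';   // stand-in for loading an article storage class
        return fn(array $query) => 'Article #' . (int) ($query['id'] ?? 0);
    },
    'search' => function () use (&$loaded) {
        $loaded[] = 'SearchIndex';    // stand-in for loading a search index class
        return fn(array $query) => 'Results for ' . ($query['q'] ?? '');
    },
];

function handleRequest(string $page, array $query, array $pages): string
{
    if (!isset($pages[$page])) {
        return '404 Not Found';
    }
    // Only the requested page's factory runs; the other pages cost nothing.
    $controller = $pages[$page]();
    return $controller($query);
}

echo handleRequest('article', ['id' => '42'], $pages), "\n";
```

A request for the article page never touches the search factory, so the per-request rebuild stays proportional to what the page actually uses.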


Done right, PHP actually scales better exactly _because_ of its inherent statelessness - there are no concerns about how to replicate runtime data across separate machines. All you need to do is replicate a minimal user session store (which hopefully you didn't put in your database!) and, as the article points out, pass persistent data from a form using the form itself.
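
One way to carry state in the form itself, sketched under some assumptions: the function names are made up, and the HMAC signature is an addition that neither the article nor this post mentions. It is there so that a value round-tripped through the browser can be checked for tampering by any server holding the same secret.

```php
<?php
// Assumption: $secret is a server-side key shared by every web server in the pool.
function packState(array $state, string $secret): string
{
    $payload = base64_encode(json_encode($state));
    return $payload . '.' . hash_hmac('sha256', $payload, $secret);
}

function unpackState(string $token, string $secret): ?array
{
    $parts = explode('.', $token, 2);
    if (count($parts) !== 2) {
        return null;
    }
    [$payload, $mac] = $parts;
    // Constant-time comparison; a mismatch means tampering or a different key.
    if (!hash_equals(hash_hmac('sha256', $payload, $secret), $mac)) {
        return null;
    }
    $json = base64_decode($payload, true);
    if ($json === false) {
        return null;
    }
    $state = json_decode($json, true);
    return is_array($state) ? $state : null;
}

// Renders the token as a hidden input so it comes back with the next POST.
function hiddenField(string $name, string $token): string
{
    return '<input type="hidden" name="' . htmlspecialchars($name, ENT_QUOTES)
        . '" value="' . htmlspecialchars($token, ENT_QUOTES) . '">';
}
```

Because the state travels with the request, whichever machine receives the next POST can rebuild it without asking any other machine. The data is signed but not encrypted, so the user can still read it.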


The bottom line - what would you rather do? Use replication at the network, filesystem and database level to achieve scalability (where it's fast, easy to maintain and supported by mature technologies), or do it at the application level (i.e. application server level), where you're relying on immature technologies that re-invent wheels which already exist (usually with significant overhead, e.g. application-level messaging protocols) and have an extra layer of complexity to worry about?

  • Fundamental Platform Differences and why PHP scales better
    2003-10-16 09:55:26  anonymous2

    > All you need to do is replicate a minimal user session store (which hopefully you didn't put in your database!)

    I'd like to hear the reasoning behind this. Is it a performance issue (files over db)? Or scalability (at least dbs can be replicated)?
    • Fundamental Platform Differences and why PHP scales better
      2003-10-16 12:52:53  anonymous2

      It's both a performance and a scalability issue, but perhaps I wasn't specific enough in simply saying "no databases".

      Regarding scalability, essentially you don't want to store high-demand but disposable user session state data (which in a desktop app, for example, you'd keep in memory) in the _same_ database as the real application data, otherwise you can't scale the layers separately.

      Regarding performance, it's a good idea to keep session data as "close" as possible to the script that needs it. You might use a separate, _local_ database (PHP5 has SQLite built in which, interestingly, is capable of storing data in memory as well as in a file) but, arguably, you want the path to the session store to be as clear as possible (certainly if it needs a network call, that's bad news).
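
A minimal sketch of such a local session store, assuming PHP 8 or later with the `pdo_sqlite` extension; the class name and table layout are invented for illustration. It uses a database file rather than SQLite's in-memory mode, because a `:memory:` database only lives as long as the connection that opened it and so would not survive from one request to the next.

```php
<?php
// Assumes PHP 8+ with pdo_sqlite; class and table names are illustrative only.
final class SqliteSessionHandler implements SessionHandlerInterface
{
    private PDO $db;

    public function __construct(string $path)
    {
        $this->db = new PDO('sqlite:' . $path);
        $this->db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        $this->db->exec(
            'CREATE TABLE IF NOT EXISTS sessions (
                id TEXT PRIMARY KEY, data TEXT NOT NULL, updated_at INTEGER NOT NULL)'
        );
    }

    public function open(string $path, string $name): bool { return true; }

    public function close(): bool { return true; }

    public function read(string $id): string|false
    {
        $stmt = $this->db->prepare('SELECT data FROM sessions WHERE id = ?');
        $stmt->execute([$id]);
        $data = $stmt->fetchColumn();
        return $data === false ? '' : $data;   // unknown id means an empty session
    }

    public function write(string $id, string $data): bool
    {
        $stmt = $this->db->prepare(
            'INSERT OR REPLACE INTO sessions (id, data, updated_at) VALUES (?, ?, ?)'
        );
        return $stmt->execute([$id, $data, time()]);
    }

    public function destroy(string $id): bool
    {
        return $this->db->prepare('DELETE FROM sessions WHERE id = ?')->execute([$id]);
    }

    public function gc(int $max_lifetime): int|false
    {
        // Drop sessions that have not been written for longer than the lifetime.
        $stmt = $this->db->prepare('DELETE FROM sessions WHERE updated_at < ?');
        $stmt->execute([time() - $max_lifetime]);
        return $stmt->rowCount();
    }
}
```

To wire it in, you would pass an instance to `session_set_save_handler()` before calling `session_start()`, pointing the constructor at a file on local disk.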

      Also, from a maintenance point of view, the less complexity you add, the better your chances when things go wrong and the lower your operating costs.

      As you scale, you can turn to stuff like IBM's Parallel Linux File System for handling session data.
      • Fundamental Platform Differences and why PHP scales better
        2003-10-16 12:58:36  anonymous2

        MySQL is actually quite a bit faster than SQLite for simple single-key selects like this. It makes a good session database. Memcached (http://www.danga.com/memcached/) is good for this too.
        • Fundamental Platform Differences and why PHP scales better
          2003-10-16 20:21:09  anonymous2

          MySQL HEAP tables could be an option too I think, the ones that stay in memory.

          • Fundamental Platform Differences and why PHP scales better
            2003-10-17 10:23:50  anonymous2

            That's what I use (with a custom session handler), and it works great.
      • Fundamental Platform Differences and why PHP scales better
        2003-10-17 13:38:11  anonymous2

        In an ideal world, yes, it would be fantastic to always have an up-to-date session store right on the same machine as the code that's trying to access it. Unfortunately, in any kind of horizontally scaled solution, that's not always possible. If you are dealing with a large group of users, there will come a point where your HTTP front end (whether it be a servlet or PHP script) WILL have to consist of more than one machine. That means you'll have to either synchronize sessions between the application servers or store the session data in a central repository that all app servers have access to (through either a roll-your-own solution using PHP database calls or through RMI communication to an EJB layer that's handling persistent session data). Either way, the network call is a necessary burden to get your application to scale in any kind of meaningful way.
        • Fundamental Platform Differences and why PHP scales better
          2003-10-18 02:50:10  anonymous2

          "That means, you'll have to either synchronize sessions between the application servers or store the session data in a central repository that all app servers have access to[...]Either way, the network call is a necessary burden to get your application to scale in any kind of meaningful way."

          True but, as you mention, synchronizing sessions between servers means the network calls are, to some extent, made outside of the user request cycle. Depends on the technology you're using.
  • Fundamental Platform Differences and why PHP scales better
    2003-10-17 04:57:36  acostin

    implement a Front Controller in PHP in a Struts-like fashion .. is not possible

    I have to say that there is a way to do this in PHP, based on other language features we have - that is conditional file require. What we do in the
  • Fundamental Platform Differences and why PHP scales better
    2003-10-17 05:00:47  acostin

    It seems that my previous post was broken because of a URL....

    I have to say that there is a way to do this in PHP, based on another language feature we have: conditional file require. What we do in the Krysalis platform - http://www.interakt.ro/products/Krysalis/ is to compile the Controller (sitemap.xml) to a suite of optimized PHP files that are required as needed.
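
The general "compile once, require as needed" technique might look something like the sketch below. This is not Krysalis code: the array stands in for an already-parsed sitemap.xml (the real format isn't shown here), and the function names, route names and cache directory are all assumptions.

```php
<?php
// Stand-in for a parsed sitemap.xml; the XML parsing step is omitted on purpose.
$sitemap = [
    'home'    => ['template' => 'home.tpl', 'title' => 'Home'],
    'article' => ['template' => 'article.tpl', 'title' => 'Article'],
];

// Done once (e.g. at deploy time): write one small PHP file per route.
function compileSitemap(array $sitemap, string $cacheDir): void
{
    foreach ($sitemap as $route => $config) {
        $code = '<?php return ' . var_export($config, true) . ';' . "\n";
        file_put_contents($cacheDir . '/' . $route . '.php', $code);
    }
}

// Done on every request: require only the file for the requested route.
function loadRoute(string $route, string $cacheDir): ?array
{
    // Reject anything that is not a plain route name before touching the filesystem.
    if (!preg_match('/^[a-z0-9_]+$/', $route)) {
        return null;
    }
    $file = $cacheDir . '/' . $route . '.php';
    if (!is_file($file)) {
        return null;
    }
    return require $file;
}

$cacheDir = sys_get_temp_dir() . '/sitemap_cache_' . getmypid();
if (!is_dir($cacheDir)) {
    mkdir($cacheDir, 0700, true);
}
compileSitemap($sitemap, $cacheDir);
```

The big configuration file is only parsed when it changes, and each request pays for a single small `require`, which a script cache can also keep compiled.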



    Alexandru