As a general preloading file seems to become a thing in PHP 7.4 (🎉) I think we should start the discussion on how this could be implemented in Composer. I'm willing to work on a PR for that but there are a few things to be clarified first.
Here are just a few thoughts that come to my mind when thinking about the implementation:
- `vendor/preload.php`?
- `composer dump-autoload -o -p` (`-p` would then also generate the `preload.php`)

I'm sure there's more to be considered and discussed. There are some very smart cookies in our community, so let's try to lay out the best solution together first and only then start coding 😊
My first idea is that this would work like the `files` autoloader: a new `preload` autoloading type. That way every package can declare file(s) that must be preloaded, and we generate a single file with all the includes. Those files can then have smarter logic on what to preload. I'd expect Symfony would, for example, preload according to the container config and whatnot.
I don't think preloading all classes always is a good idea, that'll just blow up memory usage for no reason.
This is also very cheap to generate. Much like the `files` autoloading, it's simply dumping an array of files into `preload.php`, so it can always be built; there's no need to make it optional like the optimized autoloader.
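To make the "dumping an array of files" idea concrete, here is a minimal sketch of what generating such a file could look like. The function name `dumpPreloadFile()` and the package paths are hypothetical and not part of Composer; the only point is that the output is a flat list of includes.

```php
<?php
/**
 * Hypothetical helper (not a Composer API): build the contents of a
 * vendor/preload.php from the preload files declared by packages.
 */
function dumpPreloadFile(array $files): string
{
    $lines = ['<?php', '', '// Generated preload file (sketch).'];
    // A file declared by several packages is only included once.
    foreach (array_unique($files) as $file) {
        $lines[] = 'require_once ' . var_export($file, true) . ';';
    }

    return implode(PHP_EOL, $lines) . PHP_EOL;
}

// Usage sketch: the two paths below are made up for illustration.
echo dumpPreloadFile([
    __DIR__ . '/symfony/preload.php',
    __DIR__ . '/acme/crypto/preload.php',
]);
```

Each included file would then be free to decide what it actually preloads.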
I see. That's a nice idea too!
> I don't think preloading all classes always is a good idea, that'll just blow up memory usage for no reason.

I agree, it makes no sense to load all the classes if not all of them are used. However, as the developer of an app I would like to be able to optimize this myself so I don't think it makes sense to give the responsibility to define the classes that need to be preloaded to the developer of the library. Let's take the Symfony VarDumper component as an example: In 99.5% of all cases, this is used for debugging purposes only so it likely makes no sense to preload classes of it. But what if somebody builds an API service that uses this component to style some output? Preloading would make sense there then.
In other words: Whether or not preloading a class makes sense depends on how it's being used and more often than not, the developer of the library cannot tell how it's going to be used.
I also wondered how much RAM usage we're talking about. So here are some stats for everybody:
I ran `composer create-project symfony/skeleton` and called the welcome page in the `prod` environment (`debug` set to `true`).

Then I ran `composer dump-autoload -o --no-dev` and edited my `index.php` to do some simple preloading based on `autoload_classmap.php`. So I really just used the `_preload()` function from the RFC and added this to my `index.php`:
```php
$classmap = include './../vendor/composer/autoload_classmap.php';
$classmap = array_unique($classmap); // Needed because we have multiple classes in some files
_preload($classmap);
```
I reset the cache and visited the welcome page again: Memory usage increased to 27.74 MB (919 cached files).
So yes, memory usage does increase but is it really that significant? I mean, if you enable preloading in the first place you're after good performance, right? Is it a problem then that your app uses a fair amount of RAM constantly?
All I'm trying here is really to just throw in some numbers so we can weigh up the pros and cons.
| Variant | Pros | Cons |
|---|---|---|
| Preload classmap | Super simple;<br>Easy to implement;<br>Easy to use; | Needs more RAM; |
| New "preload" autoload section | More memory efficient; | Package devs have to learn it;<br>No control for app dev;<br>Likely to miss out files on the hot path; |
Maybe there are more variants? 😄
Here are the numbers for a bigger project (using ApiPlatform, Doctrine, Guzzle, Enqueue, Symfony Translator, Redis, Symfony Console, etc.):
- Index page (docs endpoint) without preloading everything: 33.36 MB (15.29 MB net)
- Index page (docs endpoint) with preloading everything: 75.31 MB (57.24 MB net)

So you need more than double the amount of RAM. In other words, with the same server you can serve fewer than half the users compared to not preloading.
Well, that's not a fair comparison. I called just one endpoint. I would need to call the other endpoints that use other components one after the other if we wanted to find out the percentage of "useless cached files". I think I'd get a lot closer to the 57.24 MB if I did that 😊
Still a lot of overhead. More than I would like to pay to call the feature useful when loading all the things.
IMO loading everything only works for small apps
I appreciate your enthusiasm, buuut I don't think any of this is composer's responsibility. I fully agree that it's most likely app specific what you want to preload, but there are two things to consider in my proposal:
- The `preload` autoload doesn't have to be defined by every package. Typically I'd expect Symfony to define that, or maybe some crypto lib that highly benefits from preloading, assuming we get JIT optimizations in there.

Lastly, regarding the option of preloading the whole classmap: that's a 3-line foreach loop that you can write as your own preload script, loading all classes from Composer's classmap, if you are so inclined to waste memory ;) It could also be offered as a package `toflar/preload-all-the-things` that has a `preload` autoload which then goes and includes all files from the classmap.
So my position, at least for now, is to either do some very simple thing to facilitate things and "standardize" on a `vendor/preload.php` file, or alternatively do nothing at all.
I tend to agree with @Seldaek here.
The preloading-everything strategy is already straightforward once you generate a full classmap for the optimized autoloader (which you should do anyway if you care about performance, and you don't need preloading if you don't care).
Any smarter algorithm would have to rely on the structure of the project, and so it might be hard to deal with that in Composer.
In Symfony, it would probably make no sense to have a config in the package deciding which Symfony files are preloaded for all Symfony projects. But Symfony could generate a preload file for projects using it (with classes coming from Symfony, but also from other dependencies or from the project itself) based on some heuristic. This is being suggested in https://github.com/symfony/symfony/issues/29105#issuecomment-436564272 (I took the example of Symfony here because I know the current state of the discussion there, but the same could of course apply to other frameworks).
BTW: I'm not inclined to waste memory at all. It was just one variant that came to my mind and so I elaborated on it. I think it's good to consider multiple approaches, also for the people that read the issue later on. I'm perfectly fine with having the bad ones ruled out 😄
We could allow the `preload` section only in the root `composer.json` (so project specific), and that's where you specify the files you would like to be included. Whether or not you use other files that are dynamically generated by e.g. Symfony is your business then. But the important feature here would be that Composer lets you aggregate out of the box (and maybe we can get the default php.ini setting to be set to `vendor/preload.php` in PHP itself, which would just be ignored if it doesn't exist 😄).
@Toflar if it is root-only, why ask people to put the files in composer.json so that Composer requires them in another file that you then reference in your php.ini? You could reference your own file directly in php.ini.
I know 😄 The only thing it would do is somewhat "standardizing" the way to do it. Nothing else 😊
I would think that creating levels for autoloading like logging levels would allow for some sort of "automated" control with the library providing the files/classes that would be appropriate for the level. An uber level like "all" could include all files including the files for libraries not supporting the levels. Sounds like something that might fit into the PHP-FIG realm of discussion as well.
Some disorganized thoughts:
1) A way to whitelist files to preload needs to include regex support; in practice, most significant applications load hundreds of classes on every request. I'd rather eat the cost of preloading a few more than I need than having to list them all out manually.
2) One important question is that, AIUI, the increased memory usage is shared, isn't it? Viz, if we preload 50 MB worth of classes, does that increase the memory usage of every request by 50 MB or does it increase the base cost of having FPM processes by 50 MB but net reduce the memory per request? (I honestly don't know here, but it's a distinction worth verifying.)
3) Since Composer is basically the universal autoloader at this point, is there a way that Composer can assist in determining what good candidates are to preload? I'm not entirely sure how it would do that without writing data to disk, which is probably undesirable, but if we're going to say "site owners, this is your job" we should try to give them enough information to make that job really easy. What they're going to want is a list of the X most loaded classes, or the classes used on more than Y% of requests, or something like that. Is that something Composer can help compute, and if not, what would?
4) While I'm sympathetic to the simplicity of "preload all the things", there are some files that MUST NOT be preloaded. For instance, at Platform.sh we use a composer-loaded file (not class) to execute code before the application initializes. That lets us map host-provided environment variables into application-expected environment variables. The process works pretty well but that code needs to be run on every request, so the file has non-symbol-definition code (viz, it violates PSR-1), and so preloading it would break things. So it's probably useful for any auto-preload-builder thingie (Composer or otherwise) to include a way to let packages blacklist certain files that should never be preloaded.
5) I don't really see a place for FIG here; What would be in scope for FIG would be "hey packages, here's how you expose your preload info". But really, even if we decide that's a package's job to do (and it may be), 99.999947% of the time that will be via composer.json, which is out of FIG's purview.
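Points 1 and 4 could be sketched as a single filter over Composer's classmap. This is only an illustration under stated assumptions: the function name `preloadCandidates()`, the regex whitelist and the never-preload list are all hypothetical, and none of them exist in Composer.

```php
<?php
/**
 * Hypothetical filter: keep files whose path matches $includePattern and
 * drop any file explicitly listed in $neverPreload.
 */
function preloadCandidates(array $classmap, string $includePattern, array $neverPreload): array
{
    // Several classes can live in one file, so de-duplicate the paths first.
    $files = array_unique(array_values($classmap));

    $keep = array_filter($files, function (string $file) use ($includePattern, $neverPreload): bool {
        return preg_match($includePattern, $file) === 1
            && !in_array($file, $neverPreload, true);
    });

    return array_values($keep);
}

// Usage sketch with made-up paths: whitelist one vendor directory by regex
// and exclude a bootstrap file that must run on every request.
$classmap = [
    'Acme\\Http\\Client'  => '/app/vendor/acme/http/Client.php',
    'Acme\\Http\\Request' => '/app/vendor/acme/http/Request.php',
    'Host\\EnvMapper'     => '/app/vendor/acme/http/env.php',
    'App\\Controller'     => '/app/src/Controller.php',
];
print_r(preloadCandidates($classmap, '#^/app/vendor/acme/#', ['/app/vendor/acme/http/env.php']));
```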
I totally agree that `preload` should be a separate section from `autoload` as they are different concepts. For one, you can use an autoloader in your preload script:
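```php
<?php
spl_autoload_register(function ($name) {
    include_once "$name.php";
});

// Referencing the class triggers the autoloader during preloading
// (a bare `use Foo;` statement would not load anything by itself).
class_exists(Foo::class);
```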
About what should be preloaded and not, keep in mind that to refresh the preloaded files you must restart php. I think the most typical use-case would be to preload your vendor but keep your application out of it so you can deploy changes to your application without a server restart.
> One important question is that, AIUI, the increased memory usage is shared, isn't it? Viz, if we preload 50 MB worth of classes, does that increase the memory usage of every request by 50 MB or does it increase the base cost of having FPM processes by 50 MB but net reduce the memory per request? (I honestly don't know here, but it's a distinction worth verifying.)

That's the same question I was asking myself. Because I understand it the way that you preload stuff that would be loaded into memory later on anyway. Just not on every request but shared. Which in turn means that "preloading all the things" doesn't effectively change the amount of memory used except for the percentage of classes that are not needed aka classes that are shipped with a library but never used within the context of the app.
I'm not quite sure either, but it's indeed very important to know.
@Toflar your response seems to assume that all your requests are using all the classes from the codebase (so "used by the project" is the same as "used by the request"). For most projects, that's not the case. Many classes are used only by some specific requests rather than by all of them.
If Composer does decide to do something, it should be root-only. Letting your dependencies decide what to preload is not helpful.
> One important question is that, AIUI, the increased memory usage is shared, isn't it? Viz, if we preload 50 MB worth of classes, does that increase the memory usage of every request by 50 MB or does it increase the base cost of having FPM processes by 50 MB but net reduce the memory per request? (I honestly don't know here, but it's a distinction worth verifying.)

Like @Crell and @Toflar, I was asking myself the same thing. I highly doubt that the memory used by preloading classes is copied to every single PHP process, I guess it's shared memory (to be verified), so preloading the whole stuff would "only" eat ~100MB, but every PHP process thereafter maybe eats less memory?
I'm personally in favour of having composer.json generate a preload.php script that contains by default all the files from the `preload` section of the project itself and the vendor dependencies.
I don't mind if there's a way to exclude some files explicitly, though, if these files are most likely never used in the average project. But I'm not sure whether any such file would actually belong to the `autoload` section then; they would probably be dev classes used in `autoload-dev` only?
> While I'm sympathetic to the simplicity of "preload all the things", there are some files that MUST NOT be preloaded. For instance, at Platform.sh we use a composer-loaded file (not class) to execute code before the application initializes. That lets us map host-provided environment variables into application-expected environment variables. The process works pretty well but that code needs to be run on every request, so the file has non-symbol-definition code (viz, it violates PSR-1), and so preloading it would break things. So it's probably useful for any auto-preload-builder thingie (Composer or otherwise) to include a way to let packages blacklist certain files that should never be preloaded.

As far as I understand it:
`opcache_compile_file()` them; I would suggest that the preload.php file is just a big list of `opcache_compile_file()` statements;

The only issue I can think of, for your use case, is that `class_exists()` will return true because the file has been preloaded, so this might not trigger the autoloader; you would therefore have to `include()` the file explicitly. I would be curious to see what your motivation is for mixing class declarations with other code, though: should this (usually not recommended) approach prevent Composer from doing the right thing for most other users? No offense here, but I think that Composer should aim to support the recommended approach out of the box, and maybe your slightly exotic approach should require a custom preloading script?
> @Toflar your response seems to assume that all your requests are using all the classes from the codebase (so "used by the project" is the same as "used by the request"). For most projects, that's not the case. Many classes are used only by some specific requests rather than by all of them.

Then preloading is not for these projects, as explicitly mentioned in the RFC:

> And also, this approach will not be compatible with servers that host multiple applications, or multiple versions of applications - that would have different implementations for certain classes with the same name - if such classes are preloaded from the codebase of one app, it will conflict with loading the different class implementation from the other app(s).

As such, here is how I would implement preload.php (my 2 cents):
```php
<?php
// list here all the files that are generated by the optimized autoloader
opcache_compile_file(...);
opcache_compile_file(...);
opcache_compile_file(...);
// ...
```
This would be the kind of file that would be generated with no extra configuration. It's there, you can use it if you want, but you don't have to.
Optionally, I would add a `preload-exclude` or equivalent composer.json option that allows excluding individual files, entire directories, or even vendor dependencies.
For those having multiple versions of an application / dependency: to reiterate, I would advocate not using preloading at all. Or, if you're feeling brave enough, fiddle with `preload-exclude` or write your own preload script.
For all other users out there, just including the auto-generated preload.php will work like magic.
Maybe we can get @dstogov to help us out to make the right decision for the PHP community here 😊
@BenMorel:

> > @Toflar your response seems to assume that all your requests are using all the classes from the codebase (so "used by the project" is the same as "used by the request"). For most projects, that's not the case. Many classes are used only by some specific requests rather than by all of them.
>
> Then preloading is not for these projects, as explicitly mentioned in the RFC:
>
> > And also, this approach will not be compatible with servers that host multiple applications, or multiple versions of applications - that would have different implementations for certain classes with the same name - if such classes are preloaded from the codebase of one app, it will conflict with loading the different class implementation from the other app(s).

You've misunderstood @Toflar's comment about some classes only being used in some requests of the application vs multiple applications / multiple versions of applications. They're not the same thing at all.
@teohhanhui I don't think I misunderstood @Toflar's comment, I was quoting @stof who was himself replying to @Toflar.
To clarify my thoughts:
> Maybe we can get @dstogov to help us out to make the right decision for the PHP community here 😊

I think preloading is too new a feature for its support to be implemented in Composer immediately.
Adapting applications and frameworks to preloading should first identify the best solutions, missing functionality, etc.
I tried preloading the whole frameworks (ZendFramework) and application specific preloading (getting the list of used PHP scripts through opcache_get_status() and generating a list of opcache_compile_file(...)). The second approach works better.
Read about usage of Java Class Data Sharing. We implemented similar technology, and may borrow use cases.
> I tried preloading the whole frameworks (ZendFramework) and application specific preloading (getting the list of used PHP scripts through opcache_get_status() and generating a list of opcache_compile_file(...)). The second approach works better.

Thanks for jumping in, @dstogov!
Could you please explain what you mean by works better? Did you get better performance? Did preloading all the classes take up too much memory? Did you have any issue?
@dstogov Can you clarify the question above regarding the memory usage of a preloaded class? Viz, if I have 100 classes that are used on virtually every request anyway, and I then preload all of them, we know that's going to save CPU time. However, is it going to increase, decrease, or have no effect on memory usage?
Similarly, if we preload 10 MB worth of code that is used only on a small fraction of requests, and there are 10 concurrent requests, have we now increased total memory usage by 10 MB (shared memory) or 100 MB (cost in each process)?
@BenMorel The example from Platform.sh is not a class at all. It's a file that looks something like this:
```php
<?php
function stuff() {
    $_ENV['db_name'] = $_ENV['dbname'];
}
stuff();
```
(Because the application wants an environment variable with one name and our system by default provides it with another. This is a very over-simplified example but it gets the idea across.)
That file is then included by composer so it runs during autoload, before the application looks for its environment variables. There's nothing intrinsically wrong with that approach and it works quite well right now. My point is that such a file MUST NOT be preloaded, because then it won't actually run on subsequent requests and break the application. We don't need to do anything special here for it other than make sure that it doesn't get picked up and preloaded accidentally by whatever mechanism Composer ends up using.
@Crell preloading such a file would have zero effect the way I understand it. As it doesn't declare a preloadable class it is ignored, and will be executed at runtime when included.
@Crell OK, I see where you're coming from. I have similar config files myself, but I don't rely on Composer to include them. Instead I manually `include()` such files in a bootstrap class, i.e. I call `Bootstrap::bootstrap()` at the top of the entry script.
I think @Seldaek is right though: preloading such a file with `opcache_compile_file()` should do nothing as it doesn't declare a class. It should still be executed at runtime when you include `vendor/autoload.php` as usual.
So my suggested approach above should not break the way you currently work: you should be able to have your php.ini point to `vendor/preload.php`, and still include `vendor/autoload.php` in your code; everything should work as before.
To be confirmed by @dstogov, or by testing the current implementation.
> > I tried preloading the whole frameworks (ZendFramework) and application specific preloading (getting the list of used PHP scripts through opcache_get_status() and generating a list of opcache_compile_file(...)). The second approach works better.
>
> Thanks for jumping in, @dstogov!
> Could you please explain what you mean by works better? Did you get better performance?

Yes. If you optimize preloading for application, you achieve better performance.
> Did preloading all the classes take up too much memory?

Today 100M or 1G of memory is not a problem, especially, if it's shared among all the workers, but the more memory app uses, the more CPU cache misses we get, and the performance is worse.
> Did you have any issue?

Preloading doesn't work with all apps out of the box. For example, WordPress includes some files depending on the result of a function_exists("foo") check, but if "foo" is preloaded these files are not included (not executed), and WordPress fails.
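The pattern in question can be simulated in a few lines. This is an illustrative sketch only, not actual WordPress code: `foo()` and `$setupRan` are placeholders, and the unconditional declaration of `foo()` stands in for the function already being known because it was preloaded.

```php
<?php
$setupRan = false;

// Stands in for a preloaded function: foo() is already defined
// before the guard below is evaluated.
function foo(): string
{
    return 'foo';
}

if (!function_exists('foo')) {
    // In the real pattern this branch would include a file that both
    // declares foo() and runs setup code; here the setup is inlined.
    $setupRan = true;
}

var_dump($setupRan); // bool(false): the guarded setup was skipped
```

Without preloading, the guard would be true on each request and the setup would run; with the function preloaded, it never does.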
> @dstogov Can you clarify the question above regarding the memory usage of a preloaded class? Viz, if I have 100 classes that are used on virtually every request anyway, and I then preload all of them, we know that's going to save CPU time. However, is it going to increase, decrease, or have no effect on memory usage?

Most probably, you'll get significant improvement in memory usage. In any cases all the requested classes are stored in shared memory, but without preloading they are usually stored in "unlinked state". On each request, they are partially copied into each process memory, and linked with parent, interfaces and traits. With preloading, most of classes, should be already stored in shared memory in linked state and won't be copied into process memory at all.
> Yes. If you optimize preloading for application, you achieve better performance.
>
> Today 100M or 1G of memory is not a problem, especially, if it's shared among all the workers, but the more memory app uses, the more CPU cache misses we get, and the performance is worse.

Does that mean that preloading all classes of a project, as opposed to preloading just say the 90% most used classes, results in a measurable drop of performance?
> Most probably, you'll get significant improvement in memory usage. In any cases all the requested classes are stored in shared memory, but without preloading they are usually stored in "unlinked state". On each request, they are partially copied into each process memory, and linked with parent, interfaces and traits. With preloading, most of classes, should be already stored in shared memory in linked state and won't be copied into process memory at all.

This is good news for those afraid that preloading the whole thing would consume more memory. It can actually decrease memory usage under heavy load!
So it sounds to me like the rule of thumb is/will be "preload as much as possible as long as you don't preload so much that you end up thrashing the CPU cache. Figuring out what that line is, well, that's your problem for your app/CPU; there's no generic answer."
Is that a fair summary to date?
In any case, I don't think composer can solve this problem for every individual use case, and still think that it should provide a preload.php script that preloads everything (app + vendor dependencies).
Again, you're free to use it or not: if you know how to fine-tune your app's preloading better, well, then don't use `vendor/preload.php`! That's not a reason why Composer should not generate one by default.
Also, @dstogov, I guess that preloading all files can hardly be slower than preloading nothing, can it? If so, then we have something to gain from this approach in pretty much every case, and with zero configuration.
@BenMorel — tell me, what's the purpose of loading all specified classes for a particular library? For example, when I use Guzzle within simple bot-project I just need to perform POST request and that's all. What is the point of preloading all available package classes when I use maybe 10% of its power? How should the Composer predict what set of classes has to be warmed-up?
@er1z

> How should the Composer predict what set of classes has to be warmed-up?

It can't. That's precisely why I am advocating that Composer generates a catch-all preloader. I am confident that it should be faster than not using preloading (unless a benchmark proves me wrong, of course).
If you know how to fine-tune your preloading (for example using @dstogov's suggested approach above: "getting the list of used PHP scripts through opcache_get_status() and generating a list of opcache_compile_file(...)"), then you're free to not use the preload.php file offered by Composer, and use an alternate approach.
Now, I see the "preload everything" approach as a default only: nothing prevents Composer from offering a way to specify which dependencies or directories to preload!
So why waste resources by preloading all from package?
@er1z To reiterate: this would just be an optional default: if you want to speed up your app with zero configuration, then you could use the default preload.php file, which is better than nothing, isn't it?
If you know what classes/dependencies you use most often, then you could tell composer to preload only those, should composer provide such a configuration option.
If you can actually profile your application and generate a preload PHP script from `opcache_get_status()`, then do so. I think a third-party script that does this would be nice, but I don't think this belongs in Composer.
If you don't want to use the default preloader, then don't. Of course there will be some waste of memory (and maybe a few CPU cycles) by using a non-optimized preloader, but if the end result is better performance anyway, and less memory used per request, then why not?
I ran a benchmark with a medium-size project of mine (90 package dependencies), that includes on every request a quite heavy bootstrap script that sets up classes for dependency injection; this makes it a good candidate for benchmarking class loading in real life conditions.
I benchmarked a simple page of the website, that makes very little use of the database, but alone triggers the autoloading of 380 classes. I tested the following configurations:
I used Composer's optimized autoloader: `composer install --optimize-autoloader`. The opcache was warmed up by loading the page manually.
I restarted the server, loaded enough pages from the website, then used this script to generate a preload file from cached files, as reported by opcache:
```php
<?php
header('Content-Type: text/plain');
echo '<?php', PHP_EOL;
$status = opcache_get_status(true);
foreach ($status['scripts'] as $script) {
    $path = $script['full_path'];
    echo 'opcache_compile_file(', var_export($path, true), ');', PHP_EOL;
}
```
This script preloads 878 files.
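A possible refinement, not part of the benchmark, would be to keep only scripts that opcache reports as frequently hit. The sketch below assumes the per-script `hits` counter returned by `opcache_get_status()`; the function name `hotScripts()` and the threshold are hypothetical, and the status array is hand-written so the example runs without a live opcache.

```php
<?php
/**
 * Hypothetical helper: pick the scripts from an opcache_get_status()
 * result whose hit count reaches $minHits.
 */
function hotScripts(array $status, int $minHits): array
{
    $paths = [];
    foreach ($status['scripts'] ?? [] as $script) {
        if (($script['hits'] ?? 0) >= $minHits) {
            $paths[] = $script['full_path'];
        }
    }
    sort($paths); // stable, readable output

    return $paths;
}

// Usage sketch with made-up data instead of a real opcache status.
$status = ['scripts' => [
    '/app/src/Kernel.php' => ['full_path' => '/app/src/Kernel.php', 'hits' => 950],
    '/app/src/Rare.php'   => ['full_path' => '/app/src/Rare.php', 'hits' => 3],
]];

foreach (hotScripts($status, 100) as $path) {
    echo 'opcache_compile_file(', var_export($path, true), ');', PHP_EOL;
}
```

Where the right threshold lies would depend on the application and how long the opcache has been collecting statistics.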
I used the following preload script, that preloads the whole composer classmap:
```php
<?php
$files = require 'vendor/composer/autoload_classmap.php';
foreach (array_unique($files) as $file) {
    opcache_compile_file($file);
}
```
This script preloads 14,541 files.
| Benchmark | Preloaded files | Server startup time | Opcache memory used | Per request memory used | Requests per second |
| --- | --- | --- | --- | --- | --- |
| No preloading | 0 | 0.06 s | 16 MB after warmup | 1,825 KB | 596 rq/s |
| Preload hot classes | 878 | 0.26 s | 21 MB | 869 KB | 695 rq/s |
| Preload everything | 14,541 | 1.56 s | 105 MB | 881 KB | 675 rq/s |
We can see interesting performance benefits whenever we use preloading:
- +13% when preloading everything
- +16% when preloading only hot classes

As predicted by @dstogov, execution is a bit faster when preloading only the classes used by a given project.
Server startup time is a non-issue to me, and the overhead of preloading everything vs preloading hot classes is only 84 MB here, so negligible on modern hardware.
What's interesting also, is that using preloading (everything or only hot classes, it doesn't matter) halved the memory consumption per request!
Looking at the above scripts though, they're so trivial that I'm starting to wonder if Composer should provide a preload script at all. Just copy/paste the above and you're done.
Benchmark info:
- `master` compiled from source
- `opcache.memory_consumption=1000`
- `opcache.max_accelerated_files=20000`

Benchmarks have been run with Apache Bench, 1000 requests with 10 concurrent requests. All benchmarks have been run 50 times and the best result was used.
These results look awesome.
I do think that native Composer support makes sense (already guesstimated that there was going to be one in my ZendCon and PHP Ruhr presentations that covered this!), but there are several things to take into account:
This is really suitable only in the case of a single-app-per-server scenario, as different apps may bump into one another with conflicting dependencies - something you generally don't need to worry about without preloading. I don't believe Composer ever touches php.ini, so this shouldn't be an issue - as the last step of actually explicitly placing the preload.php file into php.ini would be a manual step that the user would have to do proactively, but if Composer does ever update php.ini, it shouldn't automatically enable preloading under any circumstances.
As Dmitry mentioned, there are code patterns that can behave differently when preloaded - so far we've recognized function_exists(), and there may be other reflection-based code patterns that execute differently depending on what's loaded and what isn't, that could end up executing differently when preloaded vs. not.
All in all, if Composer simply provides a preload script that preloads all of the relevant classes as a convenience - and lets the user pull the trigger on actually using it - I think that would do the job nicely. It would probably be a good idea to include some comments at the top of that file (how to set it up, the fact it's for PHP >= 7.4, caveats to watch for, etc.).
BTW, on the benchmark, I would recommend running it for a slightly longer period of time. If I understand the stats correctly the benchmarks ran for just over a second (1000 requests at around 700 req/sec), which can typically end in not-so-stable results. Perhaps run with "-t 10" to see how many requests are squeezed in 10 seconds.
Thanks!
@BenMorel thanks for the benchmarks! You should not run `composer install --optimize-autoloader` but `composer install --optimize-autoloader --no-dev` though. Otherwise you will also preload all `require-dev` autoload files 😄 So your benchmark should be updated there.
One other thing I wanted to comment on re: memory consumption - preloading files actually doesn't end up consuming significantly more shared memory than it would take to regularly include()/require() them and store them in opcache. It will take a bit more - as in some cases we'd be able to resolve a few more inter-class dependencies during this stage and store the metadata in shared memory - but this should be negligible, as the main shared memory consumer is the actual opcodes (bytecode). And of course, this is shared memory well spent - as it saves both time and per-process memory later on during the request runtime.
Since most apps easily fit into several tens or hundreds of megabytes - which is really nothing on a server-wide basis (everything is shared), and since memory consumption per-process goes down significantly (and this is actually a lot more important, as it affects things like memory fragmentation) - I would think that it's better to err on the side of preloading too much than preloading too little.
Perhaps there can be a check that would alert the user in case the opcache shared memory size is inadequate for the amount of preloaded files? This can probably be implemented in the auto generated preload.php file, along with a recommendation on how to fix it - which should be very simple & cheap in most cases.
My 2c.
UPDATE: Dmitry already added this check & message in case the opcache runs out of memory/files right into the preloading implementation. So the userland implementation can be naive and just try to load everything.
These results look awesome.
They do, and even without preloading, considering the amount of work done (380 classes loaded and linked in real time, + running the actual code), it's incredible to be able to get 600 req/s on commodity hardware! Preloading is the cherry on the cake (before we get JIT 😉)!
All in all, if Composer simply provides a preload script that preloads all of the relevant classes as a convenience - and lets the user pull the trigger on actually using it - I think that would do the job nicely.
That's what I've been advocating so far, but I think we needed a benchmark to be convinced! Anyway, the end user is free to include this file in their php.ini, or run their own preload script.
BTW, on the benchmark, I would recommend running it for a slightly longer period of time.
You should not run composer install --optimize-autoloader but composer install --optimize-autoloader --no-dev though.
Good point to both of you! I ran the benchmarks again with `--no-dev` and `-t 10`. The number of files hasn't changed much, strangely: down to 14096. Here are the (even better) results:
| Benchmark | Requests per second | Diff |
| --- | --- | --- |
| No preloading | 631 rq/s | - |
| Preload hot classes | 738 rq/s | +17% |
| Preload everything | 712 rq/s | +13% |
The diff has only changed by a few decimal points, though!
Perhaps there can be a check that would alert the user in case the opcache shared memory size is inadequate for the amount of preloaded files?
UPDATE: Dmitry already added this check & message in case the opcache runs out of memory/files right into the preloading implementation. So the userland implementation can be naive and just try to load everything.
Exactly, here is the error message when starting the server with a too low `opcache.memory_consumption`:
Fatal Error Not enough shared memory for preloading!
It will be slightly more informative from now on, and point people to consider increasing the opcache.memory_consumption or opcache.max_accelerated_files (accordingly).
I put together a small and very rudimentary Composer plugin to generate a `vendor/preload.php` file from a given set of paths. I'd appreciate any feedback if you'd like to try it out.
But if the performance is actually better by just looking at the hot paths (opcache cached scripts after running multiple requests), does it even make sense to let Composer handle this? How would it be possible to determine what projects are actually 'hot'? Because if I require Guzzle, I might only want to use it on 1 in 1,000 requests, so it doesn't make much sense to always preload it. But if I use it on every single request, it does.
@barryvdh I've personally given my opinion on this, I'll summarize it here:
If anything, I think Composer should provide a `preload.php` script alongside `autoload.php`, that would be as simple as:
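<?php
// Load Composer's generated class map (class name => file path).
$files = require __DIR__ . '/composer/autoload_classmap.php';
// A file can declare several classes, so compile each file only once.
foreach (array_unique($files) as $file) {
    opcache_compile_file($file);
}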
At least Composer would come with something that would provide a nice performance boost out-of-the-box. If people want to improve performance even more (by a few %), they can generate a preload script using a script like the one I used above, but I don't think such a script belongs to Composer.
Anyway it's so trivial that I wouldn't mind if Composer did nothing about preloading.
If I was in charge of the project though, I would still go for point 1. Quick, easy, efficient enough.
How could Composer analyse the application to get a list of hot classes? I think that a well-done hot classes list would need heuristics or something so advanced that it would be out of the scope of Composer itself. Hot classes vary from project to project, and the developer should be responsible for them.
I can see that frameworks like CakePHP, Laravel, Symfony and others could benefit, because they have classes that always load, and could offer Composer a list of these through a `preload.php`.
Another solution would be to have a package totally apart that could identify a list of classes being hit when a script runs, like during a standard request-response lifecycle, and then stop and return that list, leaving the developer to consider which of all need to be included or not.
Aside from all that, Composer should offer a key to manage the preloading. Then the developer could choose to include/exclude the preload of certain packages through the root `composer.json`. So, if a package includes a huge list of classes that you barely use, you could exclude it, or override it.
It would be also cool to check how much memory a preloading could take.
But anyway, Composer should stay away from the logic that decides what is hot and what is not.
It would be handy if Composer included a key to manage how to include the preloading (in production and in development):
- `autoload` would just preload all the autoloaded classes.
- `files` would take preload scripts located somewhere, with your own logic, like taking the autoloaded classes and removing certain classes, or adding some of your own, whatever. They will be additive, meaning that if a class repeats itself it wouldn't matter, since Composer would just skip it.
The preload is only taken from the root `composer.json`. If a package has the key, good for them.
{
"preload": "autoload",
"preload-dev": {
"files": [
"my-app/preload.php",
"heuristic/preload.php"
]
}
}
The reason I'm against including this script (cool idea, btw) is that it means using opcache. Since Composer lives outside the application lifecycle and PHP itself, a separate package is needed to build a helpful preload list through analysis.
There, my two cents.
When will PHP 7.4 hotload be fully supported?
How could Composer analyse the application to get a list of hot classes?
Most sites have opcache enabled, which keeps statistics about the most used classes in the application. See @BenMorel's script above.
@zsuraski
You mention:
I would think that it's better to err on the side of preloading too much than preloading too little.
Could you elaborate on this? Are you suggesting that if too little was loaded, the application could potentially not work because it has missing dependencies/linked information?
As projects can vary massively in how many packages they have installed, I would personally opt to preload only "hot" files by default. There are many files in your project that will likely never get touched.
For example, we have installed aws-sdk-php, which is required by `league/flysystem-aws-s3-v3`, and the project has over 1,000 PHP files which, according to my app's opcache, are never cached (probably because we only use a tiny fraction of this package, and only during a weekly cron task).
I'm personally not in favour of this "blind" approach of preloading everything in `vendor`; it seems like a poor overall strategy, especially if the analytics point towards hot loading being more efficient overall.
I always look into packages and features having two ways of configuration: hands-off and manual.
Judging by the analytics, the hands-off mode should let Composer take the most used classes in the application and preload them up to a certain memory threshold (32~128 MB by default seems good for a standard application). Priority goes to the classes with the most hits, leaving out those with fewer hits.
In manual mode, though, Composer should get a script that returns an array of classes or namespaces to preload.
The latter could be also good on production environments, since you could use a predetermined list of classes and namespaces to test the performance.
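The hands-off selection described above could be sketched roughly like this. It assumes the input is the `scripts` entry returned by `opcache_get_status(true)`, whose per-file entries carry `full_path`, `hits` and `memory_consumption`; the function name `select_hot_files()` and the byte budget are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Picks the most-hit cached scripts until a memory budget is reached.
 *
 * @param array<int|string, array{full_path: string, hits: int, memory_consumption: int}> $scripts
 *        the "scripts" entry of opcache_get_status(true)
 * @param int $budgetBytes hypothetical threshold, e.g. 32 * 1024 * 1024
 * @return string[] file paths, hottest first
 */
function select_hot_files(array $scripts, int $budgetBytes): array
{
    // Most hits first.
    usort($scripts, static function (array $a, array $b): int {
        return $b['hits'] <=> $a['hits'];
    });

    $selected = [];
    $used = 0;
    foreach ($scripts as $script) {
        if ($used + $script['memory_consumption'] > $budgetBytes) {
            break; // threshold reached: everything colder is left out
        }
        $used += $script['memory_consumption'];
        $selected[] = $script['full_path'];
    }
    return $selected;
}
```

The statistics have to come from the SAPI that serves the traffic, and `opcache_get_status()` returns false when opcache is disabled, so a caller would need to check for that before passing the data in.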
From what I see, opcache has separate caches and statistics for fpm and cli, which means any forced preload, statistics, or clean actions run from the CLI probably won't have the same effect in a web app context. I worked on the very same feature in my Composer-Preload plugin, but I couldn't figure out how to bypass the separate bins Opcache has for fpm and cli.
Well, that defeats the purpose as long as there is no "access" to the FPM analytics, at least not directly.
Then, the only "sane" option would be to have a package with a class that could periodically update a list of hit classes while the application runs in FPM, ideally at the end of the request lifecycle (on something like 1 request in 20, which could be adjustable). It would add overhead, but it would be minimal. It would write a preload-ready class list with the most hits.
Then, shut down FPM and use the class list to tell opcache what to preload.
It would be cool if opcache could give some analytics about each class's memory usage, so that the list could leave out less used classes that fall outside a given memory limit each time it is updated.
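A rough sketch of such a sampling recorder, under the assumption that a per-file hit tally stored as JSON is good enough. The function names, the tally file and the 1-in-20 rate are all illustrative; `get_included_files()` and `register_shutdown_function()` are standard PHP.

```php
<?php
declare(strict_types=1);

/**
 * Adds one hit for each file to a running tally.
 *
 * @param array<string, int> $counts file => hits so far
 * @param string[]           $files  files used by the current request
 * @return array<string, int> updated tally, most-hit first
 */
function tally_hits(array $counts, array $files): array
{
    foreach ($files as $file) {
        $counts[$file] = ($counts[$file] ?? 0) + 1;
    }
    arsort($counts);
    return $counts;
}

/** Decides whether this request is sampled: on average 1 in $rate. */
function should_sample(int $rate): bool
{
    return $rate <= 1 || random_int(1, $rate) === 1;
}

/** Reads the tally file, records the files included so far, writes it back. */
function record_request(string $tallyFile): void
{
    $counts = [];
    if (is_file($tallyFile)) {
        $counts = json_decode((string) file_get_contents($tallyFile), true) ?: [];
    }
    $counts = tally_hits($counts, get_included_files());
    file_put_contents($tallyFile, json_encode($counts), LOCK_EX);
}
```

In an application this would be wired up once in the bootstrap, for example by calling `register_shutdown_function('record_request', $tallyFile)` only when `should_sample(20)` returns true. The resulting tally is already sorted with the hottest files first.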
Has anyone considered the memory usage on embedded systems where you actually do not have gigs and gigs of RAM available? If I want to host a small web server on some system that offers a humongous amount of 64 megabytes of RAM, I can preload like a dozen files and PHP will crap out if Composer has decided that "all in" is the best overall preload strategy?
(Somewhat exaggerated example, yes.)
I would say having a preload script generated only from the root project composer.json definition, and only with user-defined files is the best option, if Composer is to be used for preload file generation at all.
Preload should be disabled by default, and should allow a custom list of files to preload.
That way there wouldn't be any problems with RAM management on any device.
Guys, preloading requires an ini setting, so Composer won't stab you in the back either way, don't worry ;-)
I am highly skeptical that Composer is the right place to do hot-path analysis to determine the optimal set of files to preload. Activating preload requires an ini setting as @BenMorel notes, so really all Composer would be able to do is generate a script that preloads everything (which you can then use or not) or allow libraries to declare "preload these files", and then generate a script that preloads just those.
Anything more complex would be, I think, way out of scope for Composer itself. (Maybe an extension?) Let's not give poor Jordi and Nils a task of implementing autoload machine learning, k? 😄
If that's so, there should be no problem adding a `composer preload` command to generate a list to be preloaded as `preload.php`. Libraries could declare which files are "preloadable" under a key.
{
"preload": {
"psr-4": [
"ServerUtils\\PingTools\\",
"ServerUtils\\TracertManager"
],
"files": [
"helpers.php"
]
}
}
If a library isn't preloadable, the developer can "add" a library to the autogenerated list using the project root's `composer.json`:
{
"preload": {
"psr-4": [
"OldPackageNotPreloaded\\",
"OtherNotPreloaded\\CommonClass"
],
"files": [
"my-app-helpers.php"
]
}
}
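To make the proposed `psr-4` entries concrete, here is a sketch of how they could be turned into a file list. It assumes a class map shaped like Composer's `autoload_classmap.php` (class name => file), which is only complete when the optimized autoloader has been dumped; the `preload` key itself and the function name `files_for_preload_entries()` are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Selects the files of all classes matching the given entries.
 *
 * @param array<string, string> $classMap class name => file path
 * @param string[]              $entries  namespace prefixes (ending in "\")
 *                                        or fully qualified class names
 * @return string[] matching files, without duplicates
 */
function files_for_preload_entries(array $classMap, array $entries): array
{
    $files = [];
    foreach ($classMap as $class => $file) {
        foreach ($entries as $entry) {
            // A trailing backslash marks a namespace prefix; anything else
            // is treated as an exact class name.
            $isNamespace = substr($entry, -1) === '\\';
            if ($isNamespace ? strpos($class, $entry) === 0 : $class === $entry) {
                $files[] = $file;
                break;
            }
        }
    }
    return array_values(array_unique($files));
}
```

Entries from the package and from the root `composer.json` could simply be concatenated before calling the function, since duplicates are dropped.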
The developer can use this list in development for a quick start, since vendor files shouldn't be risky to preload, as they won't change. In production, the developer should ask OPCache for analytics and preload the best-performing list of classes for their application.
Should that be enough? Composer will only help to make a preliminary preload list, but the final decision is still in the developer's hands.
@DarkGhostHunter As has been pointed out multiple times in this thread, just preload everything. No need for such unnecessary configurations that will most likely end up preloading too little anyway, thereby defeating the whole purpose.
I also agree with what @Crell said about making composer determine the autoload files. This can easily increase the complexity of Composer and should not be part of a dependency manager.
The plugin I shamelessly self-plugged also tries to generate a `preload.php` file based on the `composer.json` directives. With current Opcache statistics limitations, I don't think it's even possible to reliably list and stat opcache files across different SAPIs (`cli` to `fpm`, etc).
@DarkGhostHunter As has been pointed out multiple times in this thread, just preload everything. No need for such unnecessary configurations that will most likely end up preloading too little anyway, thereby defeating the whole purpose.
On a development machine it could work since there are no high RAM limitations like in a production machine.
I really like the idea of the `composer.json` preloading proposal, since it allows the package developer to tell which classes or files to preload. But it only makes sense for small packages that developers write for themselves and their current project and use case, not for frameworks like Symfony, TYPO3, Drupal or Laravel, which cannot know what the developer using the framework will actually use from it.
To me, the most sensible thing would be to develop a PHP package that analyses the code base of your project and generates the preload.php accordingly. This could be a Composer plugin (or maybe something else that then gets integrated into a Composer plugin) that takes any number of folders to check, tries to resolve all the used classes, and generates the file accordingly.
This solution would have to rely on composer and the autoloader though, since there is no point in rewriting an autoloader just for this use case.
This would come in handy for developers, since they would not have to care and think about what they would actually have to provide (and based on myself, not having to care about stuff is always awesome).
Any thoughts?
EDIT: I'm a bit concerned about the whole preloading stuff too, since having to restart my `php-fpm` service means downtime, which might not be acceptable in some cases (even if we're talking about 0.26-1.5 secs)
@CDRO maybe in time this will change and you will only need to reload `php-fpm` rather than restart it (so it will not kill existing requests). On the other hand, if you cannot accept 0.26-1.5 secs of downtime, then you should probably already have HA (a multi-server setup), which lets you remove a member from the pool, restart PHP and add the member back, so there is no downtime.
I thought an FPM reload was a graceful restart, meaning workers (that have preload data in memory) are killed and restarted once the current request has been handled. A restart would just kill the in-progress request and return an error to the client.
If a reload does what @rask suggests, this would be indeed perfect, maybe @nikic has a better overview if it's like we expect it to be.
@pjona you're right, if this is an issue, HA should already be implemented, on the other hand this can easily be solved with deployment windows too, where it is accepted that the application might be down for some short time.
An FPM reload will not clear the preload state, you do need a full FPM restart.
@nikic I see. FPM workers receive a baseline exec env from the process manager, which is what must be restarted for the workers to receive a new exec env properly?
I'm not entirely sure it is possible to generate a useful list of stuff to preload without any execution stats, unless composer is somehow analysing the codebase and its real points of entry to see what's actually being used and what not. This is something a static analyser is probably better suited to do as some of the tooling required should already be there.
I'm currently custom-creating the preload file from real world opcache stats after letting the app I'm currently working on run for a few days in the wild (wild meaning automated traffic, as the app is still in dev). This solution does work and can potentially be automated to some extent, but it'd be a custom job each time.
I'm not convinced composer is the right tool for this particular job.
On the subject of preloading everything, and taking the benchmarks above such as they are, the extra memory used by opcache can be very problematic when you're running your apps in densely shared environments with very tight memory constraints, for instance Kubernetes. Any one node can be sharing a meagre 4GB of RAM among 15 or 20 pods whose resource limits and requirements have been tightly adjusted.
I think this is the preloading we are looking for: offer something basic, but let the developer expand on it.
{
    "preload": {
        "entrypoint": "entrypoint.php",
        "script": ["foo.php", "bar.php"],
        "directories": ["examples", "foo/bar"],
        "files": ["helpers.php"],
        "ignore": ["src/foo.php", "src/bar.php"]
    }
}
This gives 100% flexibility on what to preload.
Preloading means editing php.ini. The procedure should be to first point PHP at the entrypoint in the project root. That entrypoint should be handled by Composer: it links (or builds) all the preloading scripts from dependencies into one file, which is the entrypoint, with one command:
composer preload build
That will cycle through every package looking for a `preload` key and add the scripts, directories and files (and ignored files) to a compiled real entrypoint. These are cached inside the Composer bootstrapping.
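As a sketch of the "cycle and collect" step for a single package, the function below expands a `preload` section into a flat file list. It handles only the `directories`, `files` and `ignore` keys from the proposal; `entrypoint` and `script` are left out, and the function name `resolve_preload_files()` is hypothetical since no such Composer feature exists.

```php
<?php
declare(strict_types=1);

/**
 * Expands a "preload" section into a flat, de-duplicated list of PHP files.
 *
 * @param array{directories?: string[], files?: string[], ignore?: string[]} $preload
 * @param string $baseDir package (or project) root the paths are relative to
 * @return string[] sorted absolute paths
 */
function resolve_preload_files(array $preload, string $baseDir): array
{
    $baseDir = rtrim($baseDir, '/');
    $files = [];

    foreach ($preload['files'] ?? [] as $file) {
        $files[] = $baseDir . '/' . $file;
    }
    foreach ($preload['directories'] ?? [] as $dir) {
        $path = $baseDir . '/' . $dir;
        if (!is_dir($path)) {
            continue; // silently skip directories that do not exist
        }
        $iterator = new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS)
        );
        foreach ($iterator as $entry) {
            if ($entry->isFile() && $entry->getExtension() === 'php') {
                $files[] = $entry->getPathname();
            }
        }
    }

    $ignored = [];
    foreach ($preload['ignore'] ?? [] as $file) {
        $ignored[] = $baseDir . '/' . $file;
    }

    // Drop duplicates and ignored files, then sort for a stable output.
    $result = array_values(array_diff(array_unique($files), $ignored));
    sort($result);
    return $result;
}
```

Running this once per installed package and concatenating the results would give the list from which the compiled entrypoint is written.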
Ideas?
@DarkGhostHunter I'm new here and don't see under the hood of composer. But from the perspective of a mere user, this looks good to me and like something I could work with 👍
While my suggestion will allow for automatic preloading, there is still progress to be made on preloading only the "hot" files. There should be a way to save OPCache analytics about which files are hit the most, and push a part of the list based on memory constraints or a percentage threshold. If Composer could do part of that job, it would be awesome.
The latter matters because you may preload a project with 1500 files, but you may get almost the same performance for 99% of requests with just 150 files. That way you could run 10 more PHP instances instead of just one.
I think everybody agrees that the most optimal way to generate the preloading is to gather information via opcache and load only the files needed for the project.
IMHO, since most projects using composer will probably build their application on a deployment server and then push the app/website to the production server, `composer preload` will not be able to make use of the opcache statistics (in these cases at least).
But what if it could actually access this information?
I could imagine the following solutions regarding these issues:
- `composer update` / `composer preload`
- `composer preload` to make a request to gather the information to build the preload script
Would this be a viable solution?
You are welcome to keep the discussion going here as a central point for people interested in the topic to coordinate. But just to be clear, I am fairly confident that in the near future we are not going to add anything to Composer relating to preloading.
If in a year it turns out - after people have been playing with it - that there is something Composer is uniquely positioned to really help with, we can revisit. For now it seems to me much more like an application/deployment concern than a dependency management one.
I think the problem here is we are trying to preload everything.
What if we give the responsibility to package developers instead? They can declare the classes/files that need to be cached in the composer.json.
{
    "preload": [
        "/package/AbstractClassInterface.php",
        "/package/AbstractClass.php",
        "/package/helpers.php"
    ]
}
Package developers are responsible for these files. They should not have dependencies (or at least include it in preloading). Composer will detect these declared files and automatically create a preload script which we can optionally use.
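A sketch of the detection step, assuming Composer had already read each package's declared list (the `preload` key is only a proposal, and `collect_declared_preloads()` is a made-up name). Declared paths are resolved against the package's install directory and files that do not exist are dropped.

```php
<?php
declare(strict_types=1);

/**
 * Merges the "preload" lists declared by installed packages into one list.
 *
 * @param array<string, string[]> $declared package install dir => declared paths
 * @return string[] absolute paths of declared files that actually exist
 */
function collect_declared_preloads(array $declared): array
{
    $files = [];
    foreach ($declared as $installDir => $paths) {
        foreach ($paths as $path) {
            // Declared paths are treated as relative to the package root,
            // even when written with a leading slash.
            $file = rtrim($installDir, '/') . '/' . ltrim($path, '/');
            if (is_file($file)) {
                $files[$file] = true; // keyed by path to drop duplicates
            }
        }
    }
    return array_keys($files);
}
```

The returned list could then be written out as a series of `opcache_compile_file()` calls in an optional preload script.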
The problem is not preloading everything. The problem is deciding what is useful to preload and what is not, which is out of the scope of Composer; a good preloading list is built from Opcache stats and a memory limit, and has to be manually injected into php.ini.
For that purpose alone I created a package that checks your most requested files and creates a list containing these first, in descending order.
Preloading all files is irresponsible when you consider that, after a certain point, adding more files to the preload list will yield only marginal returns in performance.
Trying to see this from a simple user's perspective.
I think I read through all the comments here. Did I miss something, or was this scenario really not mentioned anywhere?
A user is hosting on a shared web host or a server he doesn't have much control over, and is using some software that uses Composer. Let's imagine the preloading behavior is defined in each package or at the root level. The user doesn't even know about it.
Well, yes, it's true that there is still the php.ini step, which prevents an accidental activation of PHP's preloading behavior (unless PHP itself gets something like a 'default link' to composer/preload.php in the future, which Toflar mentioned before).
But what if a user activates it because he found out about it in some documentation, but is not aware of its full consequences (good and bad)?
How will the user be able to get the preloaded files out of memory on a shared host?
In my opinion, activation of preloading should definitely be based on a strong opt-in, for instance by having to explicitly install the required code as a separate package, which is more involved and prevents 'accidental' activations. Neither Composer nor a package maintainer should decide this for the user without their knowledge, should they?
Of course this doesn’t mean that a preloader should not use the Composer generated class map.
@NickSdot You can't fix naivety and irresponsibility with a Composer package. I agree, but the technique to properly make an optimal preload list is beyond Composer, so any point here apart from just seeing how preloading progresses over time is moot, imo.
Most helpful comment
I ran a benchmark with a medium-size project of mine (90 package dependencies), that includes on every request a quite heavy bootstrap script that sets up classes for dependency injection; this makes it a good candidate for benchmarking class loading in real life conditions.
I benchmarked a simple page of the website, that makes very little use of the database, but alone triggers the autoloading of 380 classes. I tested the following configurations:
No preloading
I used composer's optimized autoloader: `composer install --optimize-autoloader`. The opcache was warmed up by loading the page manually.
Preloading only "hot" classes
I restarted the server, loaded enough pages from the website, then used this script to generate a preload file from cached files, as reported by opcache:
This script preloads 878 files.
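The script itself is not reproduced here. A minimal sketch of such a generator, assuming the list of cached files is taken from the `full_path` values in `opcache_get_status(true)['scripts']`, could look like the following; `build_preload_script()` is a hypothetical name and this is not necessarily how the original script worked.

```php
<?php
declare(strict_types=1);

/**
 * Builds the source of a preload script from a list of file paths.
 *
 * @param string[] $paths e.g. the full_path values reported by opcache
 * @return string PHP source with one opcache_compile_file() call per file
 */
function build_preload_script(array $paths): string
{
    $lines = ['<?php', ''];
    foreach (array_unique($paths) as $path) {
        // var_export() yields a safely quoted PHP string literal.
        $lines[] = 'opcache_compile_file(' . var_export($path, true) . ');';
    }
    return implode("\n", $lines) . "\n";
}
```

It would have to run inside the web SAPI after warm-up, for example with `array_column($status['scripts'], 'full_path')` as input once `$status = opcache_get_status(true)` has been checked to be an array, and the returned string written to the preload file.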
Preloading all the classes
I used the following preload script, that preloads the whole composer classmap:
This script preloads 14,541 files.
Results
| Benchmark | Preloaded files | Server startup time | Opcache memory used | Per request memory used | Requests per second |
| --- | --- | --- | --- | --- | --- |
| No preloading | 0 | 0.06 s | 16 MB after warmup | 1,825 KB | 596 rq/s |
| Preload hot classes | 878 | 0.26 s | 21 MB | 869 KB | 695 rq/s |
| Preload everything | 14541 | 1.56 s | 105 MB | 881 KB | 675 rq/s |
We can see interesting performance benefits whenever we use preloading:
- +13% when preloading everything
- +16% when preloading only hot classes
As predicted by @dstogov, execution is a bit faster when preloading only the classes used by a given project.
Server startup time is a non-issue to me, and the overhead of preloading everything vs preloading hot classes is only 84 MB here, so negligible on modern hardware.
What's interesting also, is that using preloading (everything or only hot classes, it doesn't matter) halved the memory consumption per request!
Wrapping up
Looking at the above scripts though, they're so trivial that I'm starting to wonder if Composer should provide a preload script at all. Just copy/paste the above and you're done.
Benchmark info:
- `master` compiled from source
- opcache.memory_consumption=1000
- opcache.max_accelerated_files=20000
Benchmarks have been run with Apache Bench, 1000 requests with 10 concurrent requests. All benchmarks have been run 50 times and the best result was used.