This issue is designed to capture the architecture and tasks required to build an extensible library for parsing URL API instructions and caching the result.
Feel free to discuss below. We'll assign specific issues for functional components as we determine them.
IImageService implementations for physical file system and remote images. IDistributedCache implementation for physical file system that can be switched out for Redis, Azure, etc. How do we handle cleanup, Response codes?The following gives an overall summary of the image processing pipeline.
IImageService is assigned using URL prefix pattern matching (Regex?)IImageService determines validity of request and attempts to return stream containing image.IDistributedCache implementation checks if file exists. (Redirect if exists?)Updated Requirements and Pipeline below
IUriParser implementation for parsing URI requests for command string that can be switched outIImageResolver implementation collection for physical file system that CommandParser implementation for converting commands into real values for processing. IImageWebProcessor for ResizeIImageWebProcessor for QualityIImageWebProcessor for FormatIImageWebProcessor for BackgroundColorIImageCache implementation for physical file system that can be switched out for Azure, etc.ILogger based loggingTagHelper for generating img, picture markupThe following gives an overall summary of the image processing pipeline.
IUriParser extracts commands; if not found, exitIImageResolver is assigned using Match() methodIImageResolver determines validity of request and attempts to return stream containing image.IImageCache implementation checks if file exists. (Returns correct response)IImageWebProcessor implementations process the imageAuthorization afaik shouldn't be ImageSharp's concern. You would plug in your authorization middleware before the ImageSharp.Web one and that would be responsible for handling anything authz-wise.
Great! I thought/hoped it would be the case.
Just adding this here for anyone wanting to get started: https://gist.github.com/clausjensen/bce60ccad6d65a92eadb4ea77229b287
@clausjensen I've actually started writing some code. Baby steps though.
Aaah awesome :) I didn't know how far you were on the basic middleware stuff but looks like you're already ahead of that!
What about multi-tenancy and access restrictions to specific set's\folders of images?
for example: products that are not "live" yet only admins are allowed to access the images
@KLuuKer That would still not be the responsibility of ImageSharp.Web - that would be something you handle in your authorization middleware. ImageSharp should simply be responsible for handling requests that have already passed through any authorization middleware that may/may not allow the request to continue being handled.
@clausjensen i can live with that
but allow to give prefixes or something for the cache keys, for multi-tenancy isolation (using the same cache)
@KLuuKer actually I think it may be a good idea to create a whole other issue just to talk about the multi tenancy stuff, as I'm pretty certain there's a lot of things in regards to how caching should/could be done and how caching can best be configured for these kind of setups.
I'm a bit worried notes and ideas like these may get lost if not separated out into an issue focused just on this part. Unfortunately it is not something I know that much about so I suspect there's a lot of potential pitfalls here we might as well get addressed to begin with. ( @JimBobSquarePants ?)
@clausjensen @KLuuKer I'll set up issues for each item later tonight.
Re caching, that's what IDistributedCache is for. Want multitenancy, switch to Azure caching or another implementation.
I think a good place to start in designing the middleware is to determine what the inputs to the middleware would be and how those come into the request (query string, headers, etc). Also what are the possible operations (is it just for resizing or are there other processing operations supported?)
It would be great if the mapping between the inputs and how they make it to the middleware was configurable so that people can use whatever routing or conventions they want and the image middleware would handle the heavy lifting.
For example, some inputs for resizing could be:
Are there any existing image services or plugins that have functionality you would like to incorporate? Are there any standards around such services?
Coming up with some sample scenarios like:
/images?w=100&h=250&q=.8
/images/100/250/.8
and walking through how the request would flow would be a great exercise. Also including caching and authentication/authorization in these scenarios will help spell out how those pieces interact.
Another big consideration in the design is making sure the response is property formatted to be consumed by a CDN. It is great to support IDistrubutedCache but ultimately CDNs are the best and most common way to cache images.
- IImageService is assigned using URL prefix pattern matching (Regex?)
Should IImageService interface something like Task<bool> ShouldProcessAsync(HttpContext context)
so the IImageService is responsible for its own implementation of URL matching.
On step 5 (Not cached? URL querystring params are parsed for matching processor.) either there is a global parser that fills some kind of _ResizeParameter Dictionary_ or again each IImageService get what he needs from the Request. It would be a solution to implement different resizing parameter conventions (headers, querystring, pathβ¦)
I think a global IImageServiceParameterProvider could be helpful to separate well the IImageService from its parameters needs.
If so the Task<bool> ShouldProcessAsync(HttpContext context) would become something like Task<bool> ShouldProcessAsync(IImageServiceParameterProvider provider) ???
Another consideration:
it would be helpfull to design the IImageServices in a chaining fashionβ¦
We could then imagine something like :
AzureStoreImageService => ImageManipulationService
This case Task<Stream> ResolveImageAsync(HttpContext context, string path) is maybe not the best interface for itβ¦
It could maybe also help to make preventive imageservice like :
PreventDdoSAttackImageService => ImageManipulationService
@lfoust
It would be great if the mapping between the inputs and how they make it to the middleware was configurable so that people can use whatever routing or conventions they want and the image middleware would handle the heavy lifting.
That feels to me like opening a can of worms. I would stick with a single unified URL Querystring API with identifiers for IImageService implementations using a prefix.
e.g. http://mysite.com/remote/http://remote-image.jpg?width=200 would resolve to the RemoteImageService since we are using the remote prefix.
If an IImageService had an empty prefix e.g. PhysicalFileImageService it would match all image requests containing recognized querystring parameters. Any implementations like this would be a last resort and only called when there are no prefix matching services.
All the inputs you mentioned would come from the Querystring API
width=200height=100quality=75format=pngxy=20&xy=50 (This is the standard method to pass arrays through qs params, though very verbose)Note IMO It's better to use more descriptive parameters as we would support more processors as plugins.
There are no standards around this as far as I'm aware.
Another big consideration in the design is making sure the response is property formatted to be consumed by a CDN. It is great to support IDistrubutedCache but ultimately CDNs are the best and most common way to cache images.
Exactly. My thought would be that use the file system IDistrubutedCache for smaller sites but if you want scaling you would use an Azure or Amazon based implementations (This is something I would probably build privately and sell licenses for - Gotta earn something from my years or work!). Those caches would be installed on a custom endpoint that the CDN will request if no matching image is found on their system.
@romain-preston
Each IImageService should definitely have something like Task<bool> ShouldProcessAsync(HttpContext context) though I think we would have already split the request URL into different parts for processing so would pass them to the method.
On Step 5. Each installed processor would accept and parse the querystring NameValueCollection and return a result accordingly. All instruction sets are querystring only. Every CDN I know allows caching by querystring params.
Another consideration:
it would be helpfull to design the IImageServices in a chaining fashionβ¦
We could then imagine something like :
AzureStoreImageService => ImageManipulationService
I don't see any benefit in this. Each service is responsible for matching a pattern and returning an image. Chaining would only add overheads and complexity.
Have a look at the current codebase. I've defined some of the interfaces and a rough guide in code + comments of the pipeline. I'm gonna need help working out the new HttpContext.Request/Response parts sorted + writing tests, creating a test website etc.
https://github.com/JimBobSquarePants/ImageSharp/tree/dotweb/src/ImageSharp.Web
@JimBobSquarePants , i see, now it makes more sense seeing the IImageWebProcessor implementation !
I'll check if i can give you a hand in the next days now that the system is almost settled.
@JimBobSquarePants been checking out out the code, and it's a good start π
Personally there's a few things I would probably design differently, but that may just be my preference as opposed to actually better!
The main thing that I would change is the way the services are created. They are currently all added on the ImageSharpMiddlewareOptions object in the UseImageSharp call. That seems like it limits you quite a lot, and means you have to pass IHostingEnvironment etc in all the APIs.
I would add these into the DI container so that you can use DI throughout instead. Then you would have an AddImageSharpMiddleware() extension method to register all the defaults, and users can easily plugin their own implementations using DI. You can obviously register the default implementations as singletons, so shouldn't be any performance differences. This is the biggest change I'd make, and it's more in-line with standard ASP.NET Core approaches. I can knock up a PR to discuss if you like π
On other things: the Commands infrastructure looks cool, but the URL/querystring matching design is very opinionated, potentially too much? I personally like the idea of essentially having a service for extracting the required parameters width=200, height=100 etc, rather than forcing them to be querystrings. For example I might want to use routing values: /resize/{width}/{height}/{path} for example.
But yeah, looking good π I've got a blog post I started as a follow up the caching one, storing files in the wwwroot which seems pretty close to what we're doing here (though less extensible). I'll try and get it finished off soon, and see if there's anything in there that's applicable here!
@andrewlock Thanks for having a look. Your feedback is fantastic I really, really appreciate it. π
Soooo.. I've implemented some changes.
First I stripped out all that IHostingEnvironment parameters. The environment is now passed through DI when I add the services. Much cleaner!
Users can implement their own implementations and configure them using the following overload in Startup (Use non-generic for default).
``` c#
public void ConfigureServices(IServiceCollection services)
{
services.UseImageSharpServices
}
Secondly I created something called an `IUriParser` that looks like this.
``` c#
/// <summary>
/// Defines a contract for parsing commands from URI's
/// </summary>
public interface IUriParser
{
/// <summary>
/// Returns a collection of commands from the current request
/// </summary>
/// <param name="context">Encapsulates all HTTP-specific information about an individual HTTP request</param>
/// <returns>The <see cref="IDictionary{TKey,TValue}"/></returns>
IDictionary<string, string> ParseUriCommands(HttpContext context);
}
This allows you to create your own dictionary of commands from the request so if you chose to. /resize/{width}/{height}/{path} would be a possible. Personally I would always use querystring params though since they're dead easy to parse.
It's looking pretty clean and it runs smoothly. π
There's still tests, a couple of built in processors, custom events to add and some 404 handling but I'm very happy with it all now.
Looking forward to any further feedback.
p.s. Your blog was invaluable today. I learnt an absolute ton about config reading through it. π»
Agree with @andrewlock on the points mentioned. There's also some limitations to what you have available when configuring (without injecting weird things around in places they shouldn't be and jumping through hoops) if you don't go with this route - so whether or not it's a personal preference (and mine too) .. I believe it's actually also the most appropriate way of initializing it π
Very nice work so far @JimBobSquarePants
@clausjensen The config stuff took me most of the day to set up, BIG learning curve! :neckbeard: I _think_ other devs should be able to configure it pretty easily in the future though though setting up their own configuration classes which is simple enough. Always open to suggestions though
@JimBobSquarePants That sounds much more conventional, glad to be of use! π The IUriParser sounds spot on π I'll have another play this afternoon and check out your handwork - top marks for turn around! π
Great! I'm totally obsessed with this now π
@JimBobSquarePants I pushed a quick update to a fork here which changes the DI design a bit - this is more what I had in mind https://github.com/andrewlock/ImageSharp/tree/dotweb
With this setup, users can easily create new implementations, and have arbitrary services injected into them. This is more the style most ASP.NET Core infrastructure is going - the "services on an options" object is a bit limiting in a lot of cases, so there's a general push away from it I think. The way I've written it, everything is just injected into the constructors.
This is just a demo, not expecting you to merge them!
There's a couple of bits I would want to think about too:
IImageService to IImageProvider or something more descriptiveKey property on IImageWebProcessor a Func<HttpContext, bool> and that would satisfy a lot of requirements. In particular, you could leverage routing, rather than effectively using a slightly hobbled version of it (prefix only) here. Options objects for the web processors. E.g. you may want to restrict the number of sizes that can be resized. That's it I think, nice work on getting it written up so quickly btw!
Aha! services.TryAddEnumerable was exactly what I have been looking for! I was trying to go that route but couldn't figure it out. Thanks!
Re your other comments.
IImageService I actually thought the very same this morning so have renamed it to IImageResolverFunc<HttpContext, bool> in IImageWebProcessor is a great idea! Much, much smarter and more flexible.I'll add an Action<IDictionary<string,string>> To the ImageSharpMiddlewareOptions class. We'll call it something like OnValidate.
In that method developers can read the commands using CommandParser and change them at will if they do not match requirements or could pose a danger.
e.g Here's what the default implementation would look like.
``` c#
///
/// Gets or sets the additional validation method used to augment commands.
/// This is called once the commands have been gathered and before an
/// Emptying the dictionary will ensure that the middleware will ignore the request.
///
public Action
{
CommandParser parser = CommandParser.Instance;
uint width = parser.ParseValue
uint height = parser.ParseValue
if (width > 4000 && height > 4000)
{
commands.Remove(ResizeWebProcessor.Width);
commands.Remove(ResizeWebProcessor.Height);
}
};
```
This, in my opinion, is super powerful and allows direct control over everything that comes in very early in the pipeline. What do you think?
Regarding the PR? It's always great to get them and looking at your code it's a great improvement. I'm a little confused by the TODO: though? Do you think it's possible, or whether they should even be optional? If someone wan't to change the default they can use the other overload no?
Excellent, this is cleaning up nicely! IImageResolver is a much better name, the lambda works well, and OnValidate looks like it will do the job nicely π
WRT the TODO note in my branch, that's due to the way I am adding services, and the difference between TryAddSingleton and TryAddEnumerable. Both overloads I've shown there will always add all the default services - that way you know the middleware always has a valid set of services - but there's not a 1st class way of users to remove the registered enumerables.
The TryAddSingleton methods work well because we only ever want a single instance of the IImageCache etc, If the user adds a registers an instance of IImageCache before calling AddImageSharp then our method will not add the default PhysicalFileSystemCache. If the user calls AddImageSharp first, the PhysicalFileSystemCache will be registered initially, but when they register a new IImageCache instance, it will effectively overwrite the default.
TryAddEnumerable is fundamentally different. It just ensures we can't register the sameResizeWebProcessor` image twice, it doesn't mean 'add this if no others have been added'.
A simple fix would be to have an overload that doesn't register these services by default. Additionally you could create a simple ImageSharpBuillder (similar to the MvcBuilder they use to configure MVC services. Then you could have a usage that looks something like this:
```C#
public void ConfigureServices(IServiceCollection services)
{
services.Use
services.AddImageSharp() //Add the default IImageCache and IUriParser
.AddResizeWebProcessor()
.AddPhysicalFileImageResolver();
services.Use<IImageCache, AzureImageCache>(); //register a custom IImageCache
services.Use<IImageProcessor, GrayscaleProcessor>(); //add an additional processor
}
```
The fluent builder style is similar to the config interface for things like EF Core and IdentityServer, so could be nice, but might be a bit overkill at this stage.
@andrewlock So I took your idea and ran with it hard! (Thanks for that π )
You can now do the following.
``` c#
// Add the default service and options.
services.AddImageSharp();
// Or add the default service and custom options.
services.AddImageSharp(
options =>
{
options.MaxBrowserCacheDays = 7;
options.MaxCacheDays = 365;
options.OnValidate = (context, dictionary) => { };
options.OnPrepareResponse = context => { };
});
// Or we can fine-grain control adding the default options and configure all other services.
services.AddImageSharpCore()
.SetCache
.SetUriParser
.AddResolver
.AddProcessor
// Or we can fine-grain control adding custom options and configure all other services.
services.AddImageSharpCore(
options =>
{
options.MaxBrowserCacheDays = 7;
options.MaxCacheDays = 365;
options.OnValidate = (context, dictionary) => { };
options.OnPrepareResponse = context => { };
})
.SetCache
.SetUriParser
.AddResolver
.AddProcessor
```
There are also factory overloads for each builder method that will allow adding services from configuration files.
I _think_ I've now covered all configuration bases with proper DI. What do you think?
Nice, that looks way nicer π The only I would probably tweak is to make the SetCache methods etc use the full AddSingleton etc methods, rather than the TryAddSingleton methods. If you're explicitly using the SetCache methods then it might be confusing for the service you register to not _acually_ be registered.
Shaping up nicely! π
Ah yes, good spot. Thanks for your help! π
What would be the behavior with regards to CDN storage or Azure Blobs. I understand that they would just act as "caches" meaning the local web application would still have to load the content and serve it, hence invalidating the CDN part of it?
A strategy I intend to implement instead, even though I would love to reuse something existing, would be to have urls generated from server side. So clients couldn't forge them. Upon url generation, the file is checked, and generated if it doesn't exist. Or it's existence is cached if it exists.
Pros:
Cons:
Thoughts?
Hi @sebastienros
I understand that they would just act as "caches" meaning the local web application would still have to load the content and serve it, hence invalidating the CDN part of it?
That's not how I would use the cache no. For scaled applications that require a CDN you would install the middleware into a single application that would be the end point for your CDN. All you client facing applications would request the images using the url + querystring from the CDN and that would only call the endpoint when it is not contained within it's own cache. You gain the full benefit of the CDN then.
CDN storage would be a pluggable cache which I plan to develop separately.
Does that make sense?
One major con I see in your approach would be load time. If you are creating the urls on the server that means for each image in your site you need to run through the entire generation pipline before the page is even rendered. That process could include http requests if you are checking for file changes. Anything goes wrong and your pages would be timing out. Url generation would even potentially slow down build generation. It's also no good for CMS applications.
Requesting images on demand and creating the urls is very common practise.
The middleware I've written already provides easily configurable mechanisms for preventing DOS attacks
Handling updated files would be an easy addition to make to the current process I think we'd alter the code here and in other cache implementations to check.
Retrieving a file via a resolver provides very little overhead. (An image would have to be loaded for processing anyway) and allows the configuration of images from multiple locations included restricted ones.
I think the middleware I've created will cover the requirements for most use cases. I've written this kind of stuff before with ImageProcessor.Web and will ensure this covers everything that does and more before final release.
Cheers
James
closing this as Imagesharp.Web is now in its own repository
See SixLabors/ImageSharp.Web#5 for a copy of the initial issue
Most helpful comment