PowerShell: Simplify usage of Multipart Uploads with WebRequestPSCmdlet

Created on 29 Aug 2016 · 57 comments · Source: PowerShell/PowerShell

Doing multipart requests with PowerShell is quite complicated and requires several extra steps for large file uploads. An easy example for a small file can be found on StackOverflow. An in-depth discussion can be found in this blog post.

It would be a huge improvement if the WebRequestPSCmdlets (Invoke-RestMethod and Invoke-WebRequest) could be enhanced so that they support Multipart messages directly.

For an implementation I would expect the following parameters:

  • MultipartFile (Path)
  • MultipartName (Name to be used in Multipart message)

.NET Core seems to have support for MultipartContent, which may simplify the implementation.
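
For illustration, a hypothetical invocation with the proposed parameters (which do not exist today; the names are from the list above) might look like this:

Invoke-RestMethod -Uri $uri -Method POST -MultipartFile 'C:\temp\upgrade.bin' -MultipartName 'package'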

Area-Cmdlets-Utility Committee-Reviewed Issue-Enhancement Resolution-Fixed

Most helpful comment

We would add an accelerator [File]

All 57 comments

This is horrendous in PowerShell and hasn't moved in an age. I personally would love to see some movement here, as uploading large files is very, very difficult.

As some pointers around the topic, see my comments here (albeit for PS 5.1), which link to some other sites, namely http://blog.majcica.com/2016/01/13/powershell-tips-and-tricks-multipartform-data-requests/

@markekraus Could you please look at this if you have free time?

@iSazonov sure. I will take a look at it. I have seen other requests in the past few years for multipart forms in general as several IoT APIs seem to only work with multipart forms.

Feel free to ping MSFT experts here if there are compatibility issues.

FWIW, we mainly use Python, Django, and Apache for our web interface stuff. This is all RESTful APIs. I am not a developer, just attempting to interact with the API, as we are seeing more sysadmins wanting to do this.

@swinster if you could provide me an exact scenario, that would be very useful. Right now, all I have to go off of is to build out a test API that takes multipart forms and files and hope that fits most scenarios.

@markekraus sure. I work for Pexip, and the entire platform can be automated via API calls. This specific example is required with regard to uploading an upgrade file to the platform, which will be in excess of 1 GB. The API documentation can be found here - https://docs.pexip.com/api_manage/api_configuration.htm?Highlight=upgrade

I can even post here the resultant function created to achieve this, although this is based on the information posted by Mario Majčica in his blog above.

@swinster awesome. If I need more I'll reach out.

@swinster Can you share your function with me? I want to make sure I'm on the right path.

This presents some interesting implementation issues. To truly support multipart form data, the cmdlets would need to accept multiple files, a field name for each file, possibly a content type for each file, and the ability to accept a dictionary of field/value pairs at the same time. Additionally, it appears some APIs are particular about the boundary used, so an optional boundary would need to be supplied. Attempting to make that mesh with the current cmdlets would be a significant rework, almost to the point of justifying separate cmdlets for multipart support.

The issue is in Multipart form data being extremely flexible and radically different from other requests. It would be easy to tack on support for a single file or to convert a -Body dictionary, but it would not be simple to mix a file with form data or to support multiple files. Fitting all the Multipart use-cases into the current cmdlets would touch a large portion of code to accommodate a valid but somewhat less common usage of the cmdlets.

IMO, it would also lead to a somewhat confusing UX. If -InFile were used for multipart files and only accepted multiple files on a multipart request, it would lead to issues with users supplying multiple paths on a standard request, getting errors, and wondering why it accepts an array of paths. Also, explaining that -Body and -InFile are mutually exclusive except when doing multipart requests would be somewhat confusing. If files were split into a separate, new parameter like -MultipartFile, that might cause confusion as to why -InFile doesn't work on multipart requests, or why there needs to be a difference in the first place.

The problem I see with simplifying this is that many of the requests I've read for sending multipart form data through PowerShell have only their own use case in mind. It might seem simple from that perspective, but when you look at a dozen or so of these requests you begin to see they almost all have distinctly different needs falling under the broad category of multipart form data. A narrow implementation would leave many use cases in their current state; a broad implementation would impact a lot of code for a feature that is less commonly used.

With that in mind, I think a fair compromise and better approach would be to accept a -Body object/collection that the cmdlets can use to create the multipart form-data request. This could be simplified by exposing new cmdlet(s) and new classes for generating that object/collection. -Body could simply be adapted to accept MultipartFormDataContent, at least easing the burden of the user managing an HttpClient.

I could start by adding support for MultipartFormDataContent in -Body. From there, it could be decided to add convenience cmdlets, or possibly even wrapper classes, to ease creation.
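
A minimal sketch of how that might look from the user's side once supported (assuming an endpoint at $uri that accepts multipart/form-data):

Add-Type -AssemblyName System.Net.Http   # needed on Windows PowerShell; PowerShell Core loads it by default
$multipart = New-Object System.Net.Http.MultipartFormDataContent
$stringContent = New-Object System.Net.Http.StringContent 'test value'
$multipart.Add($stringContent, 'testField')
Invoke-RestMethod -Uri $uri -Method POST -Body $multipart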

@JamesWTruher @dantraMSFT @PaulHigin @SteveL-MSFT What are your thoughts?

@markekraus FWIW, here is the modified function from Mario's blog.

The main addition was the Basic Authentication header, which also required a Base64-encoded version of the authentication credentials. Initially, I decrypted the PSCredential password via a separate function; however, this was changed to a single line. The Basic Authorization header is required because, although you can provide simple credentials to the HttpClient (and indeed to Invoke-WebRequest), the first request is sent without this header and is then challenged, so a second request is made. This means the entire 1 GB file is uploaded twice - a huge waste of time and resources.
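
The preemptive header construction boils down to something like this (a sketch; $cred is the PSCredential and $uri the target):

$pair = '{0}:{1}' -f $cred.UserName, $cred.GetNetworkCredential().Password
$bytes = [System.Text.Encoding]::GetEncoding('iso-8859-1').GetBytes($pair)
$headers = @{ Authorization = 'Basic ' + [System.Convert]::ToBase64String($bytes) }
# Sending the header up front avoids the 401 challenge and the duplicate upload
Invoke-WebRequest -Uri $uri -Headers $headers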

As an aside, I have a similar issue (WRT the Basic Authorization header) when downloading a large file (3-4 GB), which again leads to a two-request issue (the download of the file is actioned twice). However, Invoke-WebRequest is even worse in that case, as the entire file is streamed into memory, which caused the remote server to lock up when I tried to use that cmdlet. I eventually changed to use the WebClient (I couldn't get the HttpClient to work with my lacking programming knowledge); however, WebClient doesn't implement a timeout property, so you have to build a class that inherits from and extends the WebClient. All very, very complex and confusing, especially to someone like me with little programming expertise. As mentioned, this is an aside to this particular issue, but is similar in nature.
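
That inherit-and-extend workaround looks roughly like this, using inline C# via Add-Type (a sketch; the class name and 30-minute value are illustrative, not the exact code used):

Add-Type -TypeDefinition @"
using System;
using System.Net;
public class TimeoutWebClient : WebClient
{
    // WebClient exposes no timeout setting of its own, so override the request factory
    public int TimeoutMs = 1800000; // 30 minutes

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        request.Timeout = TimeoutMs;
        return request;
    }
}
"@

$client = New-Object TimeoutWebClient
$client.TimeoutMs = 3600000   # e.g. one hour for a multi-GB file
$client.DownloadFile($uri, 'C:\temp\largefile.bin')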

function Invoke-MultipartFormDataUpload
{
    [CmdletBinding()]
    PARAM
    (
        [string][parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]$InFile,
        [string]$ContentType,
        [Uri][parameter(Mandatory = $true)][ValidateNotNullOrEmpty()]$Uri,
        [System.Management.Automation.PSCredential]$Credential
    )
    BEGIN
    {
        if (-not (Test-Path $InFile))
        {
            $errorMessage = ("File {0} missing or unable to read." -f $InFile)
            $exception =  New-Object System.Exception $errorMessage
            $errorRecord = New-Object System.Management.Automation.ErrorRecord $exception, 'MultipartFormDataUpload', ([System.Management.Automation.ErrorCategory]::InvalidArgument), $InFile
            $PSCmdlet.ThrowTerminatingError($errorRecord)
        }

        if (-not $ContentType)
        {
            Add-Type -AssemblyName System.Web

            $mimeType = [System.Web.MimeMapping]::GetMimeMapping($InFile)

            if ($mimeType)
            {
                $ContentType = $mimeType
            }
            else
            {
                $ContentType = "application/octet-stream"
            }
        }
    }
    PROCESS
    {
        Add-Type -AssemblyName System.Net.Http

        $httpClientHandler = New-Object System.Net.Http.HttpClientHandler

        if ($Credential)
        {
            $networkCredential = New-Object System.Net.NetworkCredential @($Credential.UserName, $Credential.Password)
            $httpClientHandler.Credentials = $networkCredential
            $httpClientHandler.PreAuthenticate = $true
            $httpClient = New-Object System.Net.Http.Httpclient $httpClientHandler
            #$password = Get-PlainText -SecureString $Credential.Password
            $Base64Auth = [System.Convert]::ToBase64String([System.Text.Encoding]::GetEncoding("iso-8859-1").GetBytes([String]::Format( "{0}:{1}", $Credential.UserName, $Credential.GetNetworkCredential().Password)))         
            #$Base64Auth = [Convert]::ToBase64String([Text.Encoding]::GetEncoding("iso-8859-1").Getbytes("$($Credential.UserName):$($password)"))
            $httpClient.DefaultRequestHeaders.Add("Authorization", "Basic $Base64Auth")
        }
        else {
            $httpClient = New-Object System.Net.Http.Httpclient $httpClientHandler
        }

        $httpClient.Timeout = 18000000000 # 18,000,000,000 ticks = 30 minutes (the numeric value converts to a TimeSpan)
        #$httpClient.DefaultRequestHeaders.Add("AUTHORIZATION", "Basic YTph")

        $packageFileStream = New-Object System.IO.FileStream @($InFile, [System.IO.FileMode]::Open)

        $contentDispositionHeaderValue = New-Object System.Net.Http.Headers.ContentDispositionHeaderValue "form-data"
        $contentDispositionHeaderValue.Name = "package"
        $contentDispositionHeaderValue.FileName = (Split-Path $InFile -leaf)

        $streamContent = New-Object System.Net.Http.StreamContent $packageFileStream
        $streamContent.Headers.ContentDisposition = $contentDispositionHeaderValue
        $streamContent.Headers.ContentType = New-Object System.Net.Http.Headers.MediaTypeHeaderValue $ContentType

        $content = New-Object System.Net.Http.MultipartFormDataContent
        $content.Add($streamContent)

        try
        {
            $response = $httpClient.PostAsync($Uri, $content).Result

            if (!$response.IsSuccessStatusCode)
            {
                $responseBody = $response.Content.ReadAsStringAsync().Result
                $errorMessage = "Status code {0}. Reason {1}. Server reported the following message: {2}." -f $response.StatusCode, $response.ReasonPhrase, $responseBody

                throw [System.Net.Http.HttpRequestException] $errorMessage
            }

            #return $response.Content.ReadAsStringAsync().Result
            return $response

        }
        catch [Exception]
        {
            # ThrowTerminatingError exits the function, so nothing after it would run
            $PSCmdlet.ThrowTerminatingError($_)
        }
        finally
        {
            if($null -ne $httpClient)
            {
                $httpClient.Dispose()
            }

            if($null -ne $response)
            {
                $response.Dispose()
            }
        }
    }
    END { }
}

I have also hardcoded the ContentDispositionHeaderValue name ("package"), which is all that was required for my needs.

@swinster thanks. I just wanted to confirm you were doing something similar.

Authorization Basic is being tracked under #4274

I'm not familiar with the other issue (using WebClient for large files), but if it is a show stopper and someone hasn't already done so, you should open an issue on it.

@markekraus We discussed using https://github.com/AngleSharp/AngleSharp with @SteveL-MSFT. We could use the package to cover most features we need including multipart.

@iSazonov so that would be part of a larger rework, then, to use AngleSharp for parsing and submission?

We had a PowerShell Committee conclusion https://github.com/PowerShell/PowerShell/issues/3267#issuecomment-286917402 to use AngleSharp for parsing. I believe if the start goes well we could use AngleSharp more broadly. Yes, it may be a lot of work, so I have not started. Related: https://github.com/PowerShell/PowerShell/issues/2867

I think I may still work on accepting MultipartFormDataContent as a -Body value. It is something that would make this accessible now and would still be relevant should AngleSharp come to be used for more than just parsing.

The idea is that we would use the AngleSharp library for the web the same way we use the Newtonsoft.Json library for JSON - we exclude low-level web coding and focus our efforts on PowerShell web features. Currently we have to implement a web client ourselves - it is very difficult to implement a full-featured web client - we can't compete with browsers. In Windows PowerShell we can use IE as a workaround. In PowerShell Core we need a portable solution. Otherwise, we are doomed to endless patching of "holes".
In order to not break anything, we could maintain a new fork of our web cmdlets with new names (a prefix) as an experimental solution and test AngleSharp in that case.

Now that #4782 has been merged, it is time to consider some simplified, limited implementations.

What I think can be done without much pain and without making any breaking changes is to add multipart/form-data support to dictionary -Body values and to -InFile. This would be facilitated with something like an -AsMultipart switch.

In the case of -InFile there would only be support for a single file, and it would be added as a StreamContent. The -ContentType parameter, since it would otherwise be ignored, can be re-purposed for specifying the MIME type of the file.

The -Body dictionary would be converted into StringContent. The keys would be field names and the values would be the content.

Finally, when -AsMultipart is supplied, the -InFile and -Body dictionary can be used together for a mixed content submission.

This would add support for many basic use cases, but would not address complex multipart/form-data submissions or multiple files in a single submission. For those use cases, the MultipartFormDataContent will still need to be manually created and supplied.

This would also require some error detection, such as when something other than a dictionary is supplied to -Body when -AsMultipart is used. Also, the -ContentType will probably cause issues if it is not a valid type (will need to verify). The current logic for conflict resolution between -Body and -InFile will have to be revisited.

@iSazonov Can you add/change the label Area-Cmdlets-Utility. I'm trying to make it easier to see all the web cmdlet issues.

OK, I have been mentally planning this one for the past month and I finally have working code:

https://github.com/PowerShell/PowerShell/compare/master...markekraus:MultiPartSimplification

I'm looking for feedback from anyone in this thread.

I added two parameters: -AsMultipart and -MultipartFileFieldName. When submitting a file via multipart, a field name is required. The logic has been reworked to allow both -InFile and -Body, provided -AsMultipart is present and the body is either an IDictionary or a MultipartFormDataContent.

The new -Body usage in #4782 is still there and doesn't require -AsMultipart, but I expanded it to work with -AsMultipart and to accept an -InFile (-Body is already blocked by the MultipartFormDataContent, so you can't supply both a MultipartFormDataContent and an IDictionary).

This is all somewhat painful... but I can confirm this works:

"Test Contents" | Set-Content -Path C:\temp\test.txt
$uri = Get-WebListenerUrl -Test Multipart
$Params = @{
    AsMultipart            = $true
    Body                   = @{Testy="testest"} 
    Uri                    = $uri 
    Method                 = 'POST' 
    InFile                 = "C:\temp\test.txt" 
    ContentType            = 'text/plain' 
    MultipartFileFieldName = 'TestFile'
}
$results = Invoke-RestMethod @Params

Should we adopt a similar syntax to curl?

invoke-restmethod -Form @{key=value;upload="@<filepath>"}

@SteveL-MSFT I thought about that, but I have run into APIs that are picky about the Content-Type used for file uploads, particularly those that are expecting images. I guess in those cases users could still create their own MultipartFormDataContent and supply it to -Body. Then there is the question of how to escape @ in something like irm -Form @{TwitterUsername="@markekraus"}, and how to escape that escape sequence... etc.

using something like that would definitely make the internal logic easier to deal with and would make multiple file uploads available.

@markekraus I think we should optimize for general usage (80/20 rule). I like using @ as it's familiar to curl users and examples, but I'm slightly concerned about setting a precedent within PowerShell. I think escaping @ would just follow normal PowerShell rules: irm -form @{twitterusername="`@steve_msft"}

irm -form @{twitterusername="`@steve_msft"}

Wouldn't that just arrive at the cmdlets as @steve_msft, and then the cmdlet would try to process the steve_msft file? @ has always been problematic in strings because it is such a commonly used internet character. It was always painful in curl if you needed a string with a leading @. In any case, string parsing of some kind would need to be added.

I too worry about that precedent.

For general usage, I think my current proposal works. Most of the requests I have seen for multipart are for uploading a single file (including the one at the root of this issue). It doesn't introduce a new syntax, and it reuses parameters that users are familiar with.

Invoke-RestMethod -InFile "C:\temp\test.txt" -method POST -AsMultipart -MultipartFileFieldName Upload

@markekraus you're right, the cmdlet wouldn't know the difference. If most usage is for upload, then I would be fine with what you're proposing, except that -Method POST should be implied; and if Upload is pretty standard, perhaps that should be the default if -InFile is used.

I don't think you're going to get a ton of complaints about how you do it, as long as you don't break existing scenarios. People can put up with a little awkwardness, as long as it's documented with examples, and it works.

Having said that, I was going to say that since PowerShell is OO, it seems to me that we don't need special markers to tell filenames from strings, because we have objects -- it would make sense to me to use an object ...

I mean, from the previous example, if you had a FileInfo instead of a string ...

Invoke-RestMethod -Form @{ UserName = "Jaykul"; Avatar = (Get-Item .\avatar.png) }

... well, you would know what to do, wouldn't you?

We would add an accelerator [File]

@SteveL-MSFT I agree on the implicit POST. I don't think multipart/form-data is ever used with any other verb, and if it is, it could be implicit based on -AsMultipart.

@Jaykul That thought occurred to me after I signed off for the night. I'm OK with that. The other thing I thought of is that having -Form in addition to -InFile and -Body might be a bit confusing, but there really is no non-confusing way to go about all of this, IMO, so I guess we pick our poison.

What object type should it be detected on? I don't work on the filesystem side of the house often. Is System.IO.FileSystemInfo sufficient, or is there a better one?

@iSazonov You know, [file] is something that I think has been missing from the language. That would be worth pursuing regardless of this use case, IMO. It would certainly help here, though:

Invoke-RestMethod -Form @{ UserName = "markekraus"; Avatar = [file]".\avatar.png" }

System.IO.FileSystemInfo is the common base class for both DirectoryInfo and FileInfo. If this feature only handles a single file then FileInfo would make more sense.
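
A quick check of that hierarchy (the paths assume a typical Windows system):

(Get-Item C:\Windows) -is [System.IO.FileSystemInfo]                  # True  (DirectoryInfo)
(Get-Item C:\Windows) -is [System.IO.FileInfo]                        # False
(Get-Item C:\Windows\System32\notepad.exe) -is [System.IO.FileInfo]   # True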

I had this thought this morning: what about also supporting streams? Stream support isn't widely used in PowerShell, but I think it's been a missed opportunity. In PHP, streams were very common for generating things on the fly, like images (granted, I haven't touched PHP in a really long time).

@markekraus I think streams would be additive and could be done separately from this one?

We need new issues to discuss [file]/[directory] and streams.

For this specific implementation, it is actually simpler and trivial to add stream support to -Form. We will already be doing type evaluation on the dictionary values, and StreamContent takes a stream by default. It's actually more work, code-wise, to convert a supplied FileInfo to a stream and then to a StreamContent than it is to just wrap the supplied stream in a StreamContent. But if you think it's better to implement later, I'm OK with that.
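
To illustrate the difference (a sketch; the file path is made up):

# A supplied stream wraps directly:
$stream = New-Object System.IO.MemoryStream (,[System.Text.Encoding]::UTF8.GetBytes('generated on the fly'))
$streamContent = New-Object System.Net.Http.StreamContent $stream

# A FileInfo must first be opened as a stream, then wrapped:
$file = Get-Item 'C:\temp\test.txt'
$fileContent = New-Object System.Net.Http.StreamContent ($file.OpenRead())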

If it's simpler to do it now, there's no reason to make it more complicated later.

My tuppence worth (albeit the implementation talk is outside my knowledge base): when sending large files (think 1 GB; indeed, we regularly send 4 GB files), some of my previous attempts to use PS have completely killed the sending system, as the entire thing is read into memory. Streaming the file is the only sane way to do this (see previous comments). IMHO, without this ability, the whole feature falls flat.

@swinster Both the current non-multipart -InFile and the multipart implementation (whichever one we end up going with) use a StreamContent object, which streams the file to the HttpClient. The web cmdlets themselves are not reading the entire file into memory, but that might be what the underlying CoreFX is doing. If you have confirmed this behavior persists in PowerShell Core, can you open a new issue with steps to reproduce it?

@iSazonov That article is on the receiving (server->client) side, where I believe there is already an open issue about it. @swinster was saying a similar problem occurs on the sending (client->server) side. I believe we are already efficient on the sending side. I think that problem for sending did exist in earlier versions, but since the Core change to HttpClient, I don't think it is an issue anymore. I have not looked deeply into the receiving side, as that direction is complicated by all of the processing we do after the HttpClient call.

Removing Review - Committee; I don't think it's necessary for the committee to review this anymore. The current proposal seems fine.

@SteveL-MSFT which proposal? -Form or reusing the existing parameters? Honestly, I was really digging -Form

I thought this had landed on -Form? (or perhaps that's just what I thought because I prefer it?)

@SteveL-MSFT
Ok. I was just making sure. I'm good with -Form but since there were kind of multiple proposals floating about since my code post I wanted to make sure we were all on the same page:

-Form will accept a dictionary where the keys will be the field names. String values will be processed as StringContent. FileInfo and Stream values will be processed as StreamContent. Any other value type will be converted with LanguagePrimitives.ConvertTo<string>() and sent as StringContent. The Content-Type of files and streams will be application/octet-stream. When -Form is supplied, the method will be POST and anything passed to -Method and -ContentType will be ignored. -Form will be mutually exclusive with -Body and -InFile, and a terminating error will occur if they are used together.

Note that users needing more control or advanced Multipart features may still create and supply a MultipartFormDataContent object to the -Body parameter as provided in #4782.
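
Put together, the proposed usage would look something like this (a sketch; the field names are invented):

$form = @{
    Description = 'just a string field'      # string   -> StringContent
    Avatar      = Get-Item '.\avatar.png'    # FileInfo -> StreamContent
    Answer      = 42                         # other    -> converted to string
}
Invoke-RestMethod -Uri $uri -Form $form      # method implied as POST per this proposal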

@SteveL-MSFT You had reapplied the Review - Committee tag to this issue. Did you get a chance to discuss this one yesterday?

@markekraus no, we ran out of time, will try to get a resolution by next week although Joey and I are at PSConf next week...

@PowerShell/powershell-committee reviewed this and agree that the Form syntax of using a hashtable makes sense as well as adding appropriate accelerators for [file] and [directory]. Since HTTP Form supports Get in addition to Post, -Method is required when using -Form.

Can you elaborate on [file] and [directory]? The discussion here mentions FileInfo and DirectoryInfo, so it's not clear which type mappings are proposed.

on that same topic... should this accelerator implementation support provider items?

[File]'somedrive:\path\to\file.txt'

if [file] -eq [system.io.fileinfo], I think that will cause confusion, e.g.

using namespace System.IO
[File]::WriteAllLines($path, $line)

Clearly that's not meant to be FileInfo, but if you don't see the using, you might be confused if you get used to seeing [File] in other places.

For me, the name isn't so important as just having easy accelerators for files and directories. I'm not a huge fan of [system.io.fileinfo] not throwing on the file not existing. Ideally, this accelerator would be used like this:

function Get-Widget {
    param([PSFile]$File)
    # Do something with $File
}
Get-Widget -File 'x:\no\lo\existo.nope'
# parameter binding error for non existent file
Get-Widget -File 'c:\some\real\file.txt'
# works with existing file

$form = @{
    StringField = 'stringValue'
    FileField = [PSFile]'x:\no\lo\existo.nope'
}
# type conversion error on non existing file
$form = @{
    StringField = 'stringValue'
    FileField = [PSFile]'c:\some\real\file.txt'
}
# no error

Also... I'd like to note that the accelerators should not block the multipart implementation. I can implement this now without [file] and add support for that later. The accelerators should have been tied to a separate issue. While having them would make the UX for the -Form feature easier, for now the user can do

$form = @{
    FileField = Get-Item 'c:\some\real\file.txt'
}
iwr -Form $form -Method POST -Uri $uri

The accelerators should be a separate issue and PR. You can still use [System.IO.FileInfo] for now.

Just a quick question with regard to this issue: has anything made it to PowerShell v6 yet?

Not sure if the underlying mechanism to retrieve a file will be altered, but it came to light the other day that the way I have this "working" at the moment in v5 is a little on the slow side. An HTTP download of a 4 GB file via a browser, from a server on the same network, takes approximately 30 seconds. In PowerShell v5, it takes around 1 hour!

@swinster PowerShell 6.0.0 includes the ability to supply a System.Net.Http.MultipartFormDataContent object to the -Body parameter. You can read about this in detail here: https://get-powershellblog.blogspot.com/2017/09/multipartform-data-support-for-invoke.html. The simplified -Form parameter will be added in 6.1.0 (assuming nothing comes up to block it). It is the first feature on my TODO list, but I have a few bugs and some code refactoring I would like to resolve first.

Awesome work, thanks @markekraus . Can't wait to be able to retrieve files with -outfile

Can't wait to be able to retrieve files with -outfile

@swinster Can you elaborate on this? You can already download files and use the -OutFile parameter to save them. This has been around for several versions at least.

The blog post I linked has an earlier version of what was planned for the simplified multipart/form-data support. The planned implementation has since been revised. You can see a "demo" of the currently planned simplified multipart/form-data support here https://get-powershellblog.blogspot.com/2017/12/powershell-core-web-cmdlets-in-depth_24.html#L21 and you can see the proposal here https://github.com/PowerShell/PowerShell/issues/2112#issuecomment-337735585

@markekraus apologies, the comment above was not in the correct place, as it related to the underlying method of data retrieval for a simple GET with Invoke-WebRequest, which _was_ on the order of 200 times slower than a browser request, but actually had nothing to do with the multipart uploads. However, I have just run the same basic tests using Invoke-WebRequest to GET a large file in v6 and v5 simultaneously, and honestly there is a HUGE difference - whoever made these changes to the implementation needs some applause as well, even if this is the wrong place!

An update for anyone following this issue. I am currently working on implementing the solution as approved by the PowerShell Committee with some minor adjustments. I hope to have a PR submitted in the coming days.

After reading RFC 7578, it became clear that a single field name can be supplied multiple times with different field values. The use case for this is an array of files (multiple file selection) or multiple values for a single form field. To accommodate that, I'm adjusting the implementation slightly to treat collections as multiple field values for the same field name.

for example:

$Form = @{
    Files = Get-ChildItem c:\temp -File
    Names = "Bill", "Sue", "Jane"
    Cars = [System.Collections.Generic.List[String]]::new([String[]]@("Smart", "Honda"))
    Description = "Multiple and single value support."
    Image = [System.IO.FileInfo]"c:\picture\me.png"
}
Invoke-WebRequest -Method POST -Uri $uri -Form $Form

In the approved implementation, this would result in a collection being converted to a single string. In the implementation I am working on, Names would be supplied 3 times with Bill as the first value, Sue as the second, and Jane as the third. It doesn't add much complexity and initial testing shows it works for endpoints that support it. It provides more flexibility at minimal cost. After checking, I personally have an internal use case for this in a form that requires multiple values for a single field.
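
In terms of the underlying MultipartFormDataContent, the Names collection above would expand into repeated parts under one field name, roughly equivalent to:

$multipart = New-Object System.Net.Http.MultipartFormDataContent
foreach ($name in 'Bill', 'Sue', 'Jane') {
    # each value becomes its own StringContent added under the same field name
    $multipart.Add((New-Object System.Net.Http.StringContent $name), 'Names')
}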
