I need a way to access the body of a request as a raw string. Plug supports this functionality (https://github.com/elixir-lang/plug/blob/master/lib/plug/conn.ex#L418-L448) but, as the docs say, this can only be done once. Once the body is consumed, trying to read the body again will result in an empty string.
I believe that if you use a Phoenix controller, reading the body as string is impossible. If a client sends a request with the content-type set to application/json Phoenix will automatically parse the body as params meaning any future attempts by client code to access the body will result in an empty string.
Is there any way for client code inside of a controller to access the body as a raw string?
Something like this should work:
defmodule MyApp.Router do
...
pipeline :browser do
plug :copy_req_body
plug :super
end
...
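defp copy_req_body(conn, _) do
# read_body returns {:ok, body, conn}; keep only the body and the original conn
{:ok, body, _conn} = Plug.Conn.read_body(conn)
Plug.Conn.put_private(conn, :my_app_body, body)
end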
end
defmodule MyAppController do
...
def show(conn, params) do
the_body = conn.private[:my_app_body]
end
end
Note we don't update the conn with the read state, so it can be read downstream by the plug parsers. Let me know if that works.
@chrismccord thanks for the help. I had to modify your answer a bit (as my app is an API not a backend for a browser application). It still didn't work, let me know if my answer should work or if I messed something up:
defmodule MyApp.Router do
...
pipeline :api do
plug :copy_req_body
plug :super
end
pipe_through :api
...
end
# The rest is the same
Inside of copy_req_body the body is still an empty string. However, in the log output the parameters are still populated correctly.
@rylev ah, please try this:
defmodule MyApp.Router do
...
pipeline :before do
plug :copy_req_body
plug :super
end
...
end
@chrismccord that works :-) thanks! Just wondering: Is this the best way to handle this use case? I'm guessing the need for the raw body is seldom enough that this work around is fine, but could there be a nice API for optionally storing the raw body on the conn?
We should at least come up with a way to make it more performant since you don't need to read the request body on every single request. The easiest solution would be to match on the path_info. Let me get back to my desk and I'll have a better solution.
Something like this would be a better approach outside of a mechanism in plug to keep the raw body around. This dance only reads the body twice for the API stack. Not beautiful, but I think it works for this less common use-case. Please reopen if you have issues. Thanks!
defmodule MyApp.Router do
...
@api_mount "/api"
pipeline :before do
plug MyApp.Plugs.CopyReqBody, mount: @api_mount
plug :super
end
scope @api_mount do
pipe_through :api
...
end
end
defmodule MyApp.Plugs.CopyReqBody do
import Plug.Conn
def init(mount: "/" <> mount), do: mount
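# call/2 must be public, and conn has to be bound in the function head
def call(%Plug.Conn{path_info: [mount | _rest]} = conn, mount) do
{:ok, body, _conn} = read_body(conn)
put_private(conn, :my_app_body, body)
end
def call(conn, _), do: conn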
end
defmodule MyAppController do
...
def show(conn, params) do
the_body = conn.private[:my_app_body]
end
end
Is this still a valid approach?
yup
I'm a little confused about :before and :super in the example:
pipeline :before do
plug MyApp.Plugs.CopyReqBody, mount: @api_mount
plug :super
end
Are these a special pipeline and plug that don't exist anymore? Or should I be including [:before] in my pipe_through calls? What's :super?
When I just add it to the existing pipelines, I get the body as an empty string, as we reported before you suggested :before.
Sorry, ignore :before and :super as they are long gone, but the plug approach is still the same. You can plug CopyReqBody above your Plug.Parsers declaration in your endpoint and it will copy the body before the parsers consume it.
Ahhh. That did the trick. For future reference by the community, a bare minimum solution (in the endpoint.ex):
defp copy_req_body(conn, _) do
{:ok, body, _} = Plug.Conn.read_body(conn)
Plug.Conn.put_private(conn, :raw_body, body)
end
plug :copy_req_body
plug Plug.Parsers,
parsers: [:urlencoded, :multipart, :json],
pass: ["*/*"],
json_decoder: Poison
As stated, that doesn't let you filter on only certain endpoints and can be pretty wasteful, but it's enough of a bare minimum example that anyone should be able to work with it.
Thanks!
New problem encountered: This only works with a body size of 1024 or less (it hangs with 1025). Not capturing the body works fine with > 1024, but if we capture the body we hang when we try to parse it later.
Seems like you can configure the length and read length for Plug.Conn.read_body/2, but the defaults should be fine for this case. Is it possible that you are changing the defaults somewhere?
http://hexdocs.pm/plug/Plug.Conn.html#read_body/2
Edit: "Is it possible that the defaults are being reset somewhere."
My hypothesis is that read_body can only be called once. However, if the body is less than 1024 bytes, it's prefetched and the read-once restriction doesn't apply. Once you have to fetch the next bit of the body, the first 1024 bytes are thrown out, which causes the later read_body to time out. Still digging through source code to confirm, but the docs say: "Because the request body can be of any size, reading the body will only work once, as Plug will not cache the result of these operations."
I tend to concur with @hamiltop. I was testing something very similar (reading & storing the entire body), and was trying to get it to support when :more is returned. Once the whole body has been read (ie :ok is returned), the subsequent call hangs for a few seconds then returns a timeout.
I added the code suggested above and am facing the same :timeout issue as @hamiltop & @ksol. Stack trace:
2015-11-11T03:40:26.862054+00:00 app[web.1]: 03:40:26.861 request_id=59bb986b-303f-4376-be5b-f4225a089321 [info] POST /stub_urls/1dd4e5b067ad90adb1cc169c
2015-11-11T03:40:41.864557+00:00 app[web.1]: 03:40:41.864 [error] #PID<0.345.0> running StubOnWeb.Endpoint terminated
2015-11-11T03:40:41.864561+00:00 app[web.1]: Server: stubonweb.herokuapp.com:80 (http)
2015-11-11T03:40:41.864562+00:00 app[web.1]: Request: POST /stub_urls/1dd4e5b067ad90adb1cc169c
2015-11-11T03:40:41.864563+00:00 app[web.1]: ** (exit) an exception was raised:
2015-11-11T03:40:41.864564+00:00 app[web.1]: ** (CaseClauseError) no case clause matching: {:error, :timeout}
2015-11-11T03:40:41.864565+00:00 app[web.1]: (plug) lib/plug/parsers/urlencoded.ex:10: Plug.Parsers.URLENCODED.parse/5
2015-11-11T03:40:41.864565+00:00 app[web.1]: (plug) lib/plug/parsers.ex:186: Plug.Parsers.reduce/6
2015-11-11T03:40:41.864566+00:00 app[web.1]: (stub_on_web) lib/stub_on_web/endpoint.ex:1: StubOnWeb.Endpoint.phoenix_pipeline/1
2015-11-11T03:40:41.864567+00:00 app[web.1]: (stub_on_web) lib/phoenix/endpoint/render_errors.ex:34: StubOnWeb.Endpoint.call/2
2015-11-11T03:40:41.864568+00:00 app[web.1]: (plug) lib/plug/adapters/cowboy/handler.ex:15: Plug.Adapters.Cowboy.Handler.upgrade/4
2015-11-11T03:40:41.864569+00:00 app[web.1]: (cowboy) src/cowboy_protocol.erl:442: :cowboy_protocol.execute/4
2015-11-11T03:40:41.854441+00:00 heroku[router]: at=info method=POST path="/stub_urls/1dd4e5b067ad90adb1cc169c" host=stubonweb.herokuapp.com request_id=59bb986b-303f-4376-be5b-f4225a089321 fwd="106.216.176.210" dyno=web.1 connect=0ms service=15003ms status=500 bytes=243
More context available on http://stackoverflow.com/questions/33637418/elixir-phoenix-on-heroku-timeout-error-after-15-seconds and https://github.com/ninenines/cowboy/issues/833#issuecomment-155656776
Source Code: https://github.com/endeepak/stub_on_web
@endeepak To be clear, this only happens to you when you try to read/copy the request body, and does _not_ happen if you use normal Plug.Parsers without the custom body copy?
@chrismccord Yes. This issue started after adding code to copy request body. Other observations which might help
The "workaround" for now would be to not copy the request body. What is your usecase for needing the raw body? Also, I'm not positive this is a real cowboy issue without digging into how plug itself chunks the body reads.
StubOnWeb is a stub http server which needs to record request and response for calls made to stubbed urls. Hence the workaround to not copy won't help. I'll dig into the code and try few stuff. Thanks for the help. If I find anything useful, I'll update here.
My use case was signed requests from github webhooks. I needed the raw body
to verify the signature. I ended up not using Phoenix for that endpoint and
just built a normal plug that halts the connection.
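The raw body matters for signed webhooks because the signature is computed over the exact bytes the sender transmitted. Below is a minimal sketch of such a check. It assumes an HMAC-SHA256 hex digest prefixed with `sha256=`; that scheme, the module name `WebhookSignature`, and the function names are illustrative assumptions and do not come from this thread. `:crypto.mac/4` needs OTP 22.1 or later.

```elixir
defmodule WebhookSignature do
  @moduledoc """
  Hypothetical helper: verifies an HMAC-SHA256 signature over a raw request body.
  """

  # Computes "sha256=<lowercase hex>" over the exact bytes of the body.
  def sign(raw_body, secret) when is_binary(raw_body) and is_binary(secret) do
    digest = :crypto.mac(:hmac, :sha256, secret, raw_body)
    "sha256=" <> Base.encode16(digest, case: :lower)
  end

  # Returns true only if the header value matches the expected signature.
  def valid?(raw_body, secret, signature_header) when is_binary(signature_header) do
    secure_compare(sign(raw_body, secret), signature_header)
  end

  # A missing header (nil) is never valid.
  def valid?(_raw_body, _secret, _signature_header), do: false

  # Compares every byte so timing does not reveal where the first mismatch is.
  defp secure_compare(a, b) when byte_size(a) == byte_size(b) do
    Enum.zip(:binary.bin_to_list(a), :binary.bin_to_list(b))
    |> Enum.reduce(0, fn {x, y}, acc -> :erlang.bor(acc, :erlang.bxor(x, y)) end)
    |> Kernel.==(0)
  end

  defp secure_compare(_a, _b), do: false
end
```

Changing even one byte of the body (for example, appending a space) invalidates the signature, which is why the parsed params cannot stand in for the raw body.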
My use case is actually the same as @hamiltop's. GH Webhooks are a blast! /s
@hunterboerner https://github.com/hamiltop/ashes/blob/master/lib/ashes/github_webhook_plug.ex might be a good reference for you then.
Perhaps the simplest short-term solution would be to have a custom parser(s) that copies the body?
plug Plug.Parsers,
parsers: [MyApp.Parsers.URLENCODED, :multipart, :json],
pass: ["*/*"],
json_decoder: Poison
...
defmodule MyApp.Parsers.URLENCODED do
@moduledoc """
Parses urlencoded request body, and optionally copies the raw body
Copies raw body to `:raw_body` private assign when `:copy_raw_body`
private assign is true
"""
@behaviour Plug.Parsers
alias Plug.Conn
def parse(conn, "application", "x-www-form-urlencoded", _headers, opts) do
case Conn.read_body(conn, opts) do
{:ok, body, conn} ->
Plug.Conn.Utils.validate_utf8!(body, "urlencoded body")
decoded_body = Plug.Conn.Query.decode(body)
if conn.private[:copy_raw_body] do
{:ok, decoded_body, Plug.Conn.put_private(conn, :raw_body, body)}
else
{:ok, decoded_body, conn}
end
{:more, _data, conn} ->
{:error, :too_large, conn}
end
end
def parse(conn, _type, _subtype, _headers, _opts) do
{:next, conn}
end
end
@chrismccord this was my temp fix: https://github.com/rabbit-ci/backend/blob/002b9b6daba0803aad4d4bba84c28b3827c4ba8b/apps/rabbitci_core/lib/rabbitci_core/json_parser.ex
Thanks @chrismccord, your solution works for now. I'll keep an eye on this thread for the proper fix.
Thanks @hunterboerner, I have changed JSON parser also on similar lines.
For reference, commit with this fix https://github.com/endeepak/stub_on_web/commit/47192558f501652edd8cd237a5a2430f38177ca4
Yes, you cannot read the body before because Plug.Parsers will try to read it and timeout as the client has nothing else to send. :+1: for @hunterboerner's solution!
For a long-term fix, it would be good to have a solution that doesn't involve duplicating the parsers' code. As an Elixir noob, I can think of a couple of approaches to achieve this:
1. An option on Plug.Conn to retain the raw body, so that a subsequent read_body can return the body. Since it is optional, it would not lead to the memory concern raised in https://github.com/elixir-lang/plug/pull/308
2. A copy_raw_body_as: :raw_body option to Plug.Parsers, which copies the raw body as a private assign in Plug.Conn
Please share your thoughts.
I'm also dealing with this need, as I wanted to have a Phoenix app acting as a reverse proxy for some endpoints. In these cases, it would be nice to pass the raw body transparently to an upstream server.
I tend to prefer @endeepak's option 2, as this seems like a parser concern, not the plug connection's.
@tjsousa We're attempting to do something similar. Phoenix sits in front of an existing application, and for some types of requests, proxies them to the app. We have it mostly working, with the exception for multipart/form-data uploads. Is it possible to get to the 100% raw body or does that require monkeying with Cowboy or something earlier in the process?
+1 to the use-case of reverse proxying to existing apps.
I'll follow the advice above and implement a custom parser. Thanks!
I can't seem to make the solution @hunterboerner came up with work either. I've copied the json parser from plug and then edited in a few places to put_private(conn, :raw_body, body) just like Hunter did. Then in endpoint.ex in the parsers option to Plug.Parsers I replaced :json with my module name Kevdog.JSONParser. Still, nothing gets written to raw_body. The field isn't present.
In the course of my troubleshooting I put endpoint.ex back to the way it was when phoenix generated it. Then I removed the json_decoder option from Plug.Parsers which (as I read the code) should have thrown an error. But it doesn't. This is making me wonder if the plugs in endpoint.ex are even running.
@blackfist are you using conn.private.raw_body or something else?
@hunterboerner yeah I am. So the next thing I tried doing was taking the json.ex from Plug, and copying that into a module with a different name and making all of the returned conns have a private field. For example:
def parse(conn, _type, _subtype, _headers, _opts) do
{:next, put_private(conn, :kevdog, "you the man now dog")}
end
defp decode({:more, _, conn}, _decoder) do
{:error, :too_large, put_private(conn, :kevdog, "you the man now dog")}
end
And then I added that to endpoint.ex like this:
plug Plug.Parsers,
parsers: [:urlencoded, :multipart, KevdogBroker.JSONParser, :json],
pass: ["*/*"],
json_decoder: Poison
Finally, in my controller I have IO.inspect conn.private. But when I curl the application there is no :kevdog in conn.private.
oh hell. It looks like my problem is that I was not setting the Content-Type properly in curl. I was using curl -H "Accepts: application/json" -X POST -d '{"json": "valid"}' localhost:4000/api/github when I should have used -H "Content-Type: application/json"
Btw, this github webook plug continues to be a good reference for those interested in reading the body:
https://github.com/hamiltop/ashes/blob/master/lib/ashes/github_webhook_plug.ex
You would use it as:
plug GithubWebhookPlug, mount: "webhook_endpoint", secret: "your key"
and that should be good to go as long as you plug it before Plug.Parsers. You can also customize the plug above for other systems, like Facebook; just make sure to halt whenever you read the body, otherwise you will get timeouts when you reach Plug.Parsers.
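A rough sketch of the "read, then halt" pattern described here. The module name `MyApp.Plugs.RawBodyWebhook`, the `:mount` option handling, and the response codes are assumptions made for illustration; only `Plug.Conn.read_body/2`, `put_private/3`, `send_resp/3` and `halt/1` are real Plug functions. Calls are fully qualified so the module does not depend on an import.

```elixir
defmodule MyApp.Plugs.RawBodyWebhook do
  @moduledoc """
  Hypothetical plug: reads the raw body for one mount point, responds, and halts
  so that Plug.Parsers never tries to read the already-consumed body.
  """

  # The mount is the first path segment this plug should handle, e.g. "webhook".
  def init(opts), do: Keyword.fetch!(opts, :mount)

  # Only requests whose first path segment equals the mount are handled here.
  def call(%{path_info: [mount | _rest]} = conn, mount) do
    case Plug.Conn.read_body(conn) do
      {:ok, raw_body, conn} ->
        # A real handler would verify a signature over raw_body at this point.
        conn
        |> Plug.Conn.put_private(:raw_body, raw_body)
        |> Plug.Conn.send_resp(200, "ok")
        |> Plug.Conn.halt()

      {:more, _partial, conn} ->
        # The body exceeded the default read length; reject instead of buffering.
        conn |> Plug.Conn.send_resp(413, "body too large") |> Plug.Conn.halt()

      {:error, _reason} ->
        conn |> Plug.Conn.send_resp(400, "could not read body") |> Plug.Conn.halt()
    end
  end

  # Everything else passes through untouched and reaches Plug.Parsers as usual.
  def call(conn, _mount), do: conn
end
```

It would be plugged in the endpoint above Plug.Parsers, for example `plug MyApp.Plugs.RawBodyWebhook, mount: "webhook"`. Because every branch that reads the body also halts, the parsers never see a consumed body.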
I'm having trouble with this when implementing a Stripe webhook and handling authentication. The body, which is needed for Stripe.Webhook.construct_event/3, appears to be empty.
https://github.com/sikanhe/stripe-elixir
Any ideas?
@ch-andrewrhyne here's a super rough implementation of github webhooks that reads the body. it's just a Plug, no phoenix, but does the job: https://github.com/hamiltop/ashes/blob/master/lib/ashes/github_webhook_plug.ex
Useful, but my issue is that I don't know how to access the raw body on the conn within my plug that I am using to authenticate stripe webhook requests. It appears that Phoenix has already parsed the body by the time my plug is invoked. Is there a way to stash away the raw body before Phoenix processes the JSON?
@ch-andrewrhyne here is the solution I use : https://gist.github.com/TechMagister/1a01b22c205309238617fb1fef1603a4
It's not perfect, but it's the least intrusive way I found.
@ch-andrewrhyne did you get this working? I am running into the same issue
Sharing my solution for those working with stripe webhooks:
https://gist.github.com/atomkirk/e05fbab86f34331ffe812a1b98f63851
Sharing my solution in case someone is looking for a light-touch way to do this. My use case was similar to others here: verifying a Shopify request. I added a plug in Endpoint that reads the body only on specific paths, and then removes the content-type header from the request. Removing the content-type header effectively skips the Parsers plug. This was fine in my scenario because I just wanted to get a signature for the raw body. If you need to parse the body, you may want to store the content-type in conn.private so it can be accessed later.
In MyAppWeb.Endpoint:
defmodule MyAppWeb.Endpoint do
...
defp put_raw_req_body(conn, _) do
case String.match?(conn.request_path, ~r[shopify/webhooks/*]) do
false -> conn
true ->
{:ok, body, _} = Plug.Conn.read_body(conn)
conn
|> Plug.Conn.put_private(:raw_body, body)
|> Plug.Conn.delete_req_header("content-type")
end
end
plug :put_raw_req_body
plug Plug.Parsers,
parsers: [:urlencoded, :multipart, :json],
pass: ["*/*"],
json_decoder: Poison
...
end
Thanks @curthasselschwert! Note you can match on conn.path_info instead of the request path, and that should be quite a bit faster:
case conn.path_info do
["shopify", "webhooks" | _] ->
{:ok, body, conn} = Plug.Conn.read_body(conn)
conn
|> Plug.Conn.put_private(:raw_body, body)
|> Plug.Conn.delete_req_header("content-type")
_ ->
conn
end
Also note that Plug v1.6 will allow this to be done via an option to Plug.Parsers.
I have encountered this problem as well. @curthasselschwert when I try Plug.Conn.read_body(conn), I get an empty string in my body. What is this body supposed to be? The body_params? Any idea why it could be empty? I am definitely sending data and I can see it in body_params under conn.
@nitin21 the body can only be read once, and it is usually done in a Parser plug. Please see the new-ish :body_reader option on Plug.Parsers added in v1.5.1 of Plug:
https://hexdocs.pm/plug/Plug.Parsers.html#module-custom-body-reader
My problem was interesting and was solved using the above information. I will state it here in case something else is going on in the guts of handling a web request.
POSTing application/xml to a controller, and doing a conn.read_body would work locally, but deployed to ECS behind ALB it would intermittently fail with ALB 502 errors. (Perhaps 1 out of 5 requests, and only when executed in quick succession with request bodies > 1k).
These failed requests would not be logged by Phoenix - even adding onrequest and onresponse Cowboy hooks did not log the request. I cannot explain this, but the fix was to create an explicit Plug.Parsers parser for application/xml and do the read_body in the parser, not in the controller.
Another strange thing I noticed (and only behind an AWS ALB / ECS): if you POST application/xml (with no explicit MIME handler / rejection) and never do a read_body, the request is never processed, i.e. the client (curl) just hangs until you kill it. This is perhaps a DoS vector.
In local mode (curl to Docker/Phoenix) none of these errors could be replicated.
For future searchers, Plug now has an elegant solution to this problem which was added by this PR: https://github.com/elixir-plug/plug/pull/698 which allows you to do the following
defmodule AuditorWeb.CacheBodyReader do
def read_body(conn, opts) do
{:ok, body, conn} = Plug.Conn.read_body(conn, opts)
body = [body | conn.private[:raw_body] || []]
conn = Plug.Conn.put_private(conn, :raw_body, body)
{:ok, body, conn}
end
def read_cached_body(conn) do
conn.private[:raw_body]
end
end
# endpoint.ex
plug Plug.Parsers,
parsers: [:urlencoded, :multipart, :json],
pass: ["*/*"],
body_reader: {AuditorWeb.CacheBodyReader, :read_body, []},
json_decoder: Phoenix.json_library()
@minhajuddin this was a great end to this thread as I, too, was bit by the "verify request" problem. In your example above, what's this line doing: body = [body | conn.private[:raw_body] || []]
I ask because {:ok, body, conn} = Plug.Conn.read_body(conn, opts) returns a simple string, but read_cached_body returns a list w/ a single string (in my use-case). Should we be protecting against times when there are multiple elements in that list?
Thanks for posting this - was about to go down a very dark path!
@coladarci Plug.Conn.read_body has the following spec:
read_body(t, Keyword.t) :: {:ok, binary, t} | {:more, binary, t} | {:error, term}
So, for larger requests this function will be called multiple times, as the request is read over multiple calls to Plug.Conn.read_body. That is why we are creating an iolist with the request body.
Right, but won't {:ok, body, conn} = Plug.Conn.read_body(conn, opts) blow up in the event you get a {:more, binary, t} ?
hmm, you are right it would blow up, that code might need some rework :)
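One possible rework, as a sketch only: pass every result of `Plug.Conn.read_body/2` through unchanged, including `{:more, ...}` and `{:error, ...}`, and cache each chunk on the way. The module name `MyAppWeb.CacheBodyReader` and the private key `:raw_body_chunks` are hypothetical names chosen for this example. The parser still receives the plain binary it expects, and only the cache is kept as a list.

```elixir
defmodule MyAppWeb.CacheBodyReader do
  @moduledoc """
  Hypothetical body reader for the :body_reader option of Plug.Parsers.
  Caches every chunk it reads, without assuming the body fits in one read.
  """

  # Same return shape as Plug.Conn.read_body/2, so Plug.Parsers can use it as is.
  def read_body(conn, opts) do
    case Plug.Conn.read_body(conn, opts) do
      {:ok, chunk, conn} -> {:ok, chunk, cache(conn, chunk)}
      {:more, chunk, conn} -> {:more, chunk, cache(conn, chunk)}
      {:error, reason} -> {:error, reason}
    end
  end

  # Chunks are prepended as they arrive, so they are reversed before joining.
  def read_cached_body(conn) do
    (conn.private[:raw_body_chunks] || [])
    |> Enum.reverse()
    |> IO.iodata_to_binary()
  end

  # Stores the newest chunk at the head of the cached list.
  defp cache(conn, chunk) do
    chunks = conn.private[:raw_body_chunks] || []
    Plug.Conn.put_private(conn, :raw_body_chunks, [chunk | chunks])
  end
end
```

With this shape, `read_cached_body/1` always returns a single binary, whether the body arrived in one read or several.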
If the body of a request is just JSON one could just decode the map within the controller instead of trapping it in a plug. This means it's simple and can be done in the controller.
defmodule MyAppWeb.RawJsonController do
# ...
def get_raw(conn, params) do
raw_body = params |> Jason.encode!
end
end
Kind of a waste to backtrace but if the body expected is small then it's an option.
@jochasinga - this is an entire thread around exactly that question and the answers right above your comment are still accurate.
To provide an update on where we ended up:
While the out of the box phoenix puts parsing inside the endpoint, we decided to move that parsing into our router so we can apply different parsing for different groups of routes via pipe_through.
For example, a set of routes that all need to do some stupid verification of the raw body (note @jochasinga the raw body is NOT guaranteed to be the same as re-encoding the params), we can then add our custom plug to that pipeline to save the raw body for later use:
defmodule Plugs.CacheBodyReader do
def read_body(conn, opts) do
{:ok, body, conn} = Plug.Conn.read_body(conn, opts)
conn = Plug.Conn.put_private(conn, :raw_body, body)
{:ok, body, conn}
end
def read_cached_body(conn) do
conn.private[:raw_body]
end
end
And then: pipe_through [:parsed_with_cached_body] in our router.
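The pipeline itself is not shown above. A sketch of what it might look like follows: the pipeline name comes from the comment, while the parser list, the `Jason` decoder, and the `"/webhooks"` scope are assumptions modelled on the endpoint examples earlier in the thread.

```elixir
# In the router. Plug.Parsers is removed from the endpoint and declared per pipeline.
pipeline :parsed_with_cached_body do
  plug Plug.Parsers,
    parsers: [:urlencoded, :multipart, :json],
    pass: ["*/*"],
    # Hands body reading to the caching reader defined above.
    body_reader: {Plugs.CacheBodyReader, :read_body, []},
    json_decoder: Jason
end

# Hypothetical scope: only these routes pay the cost of keeping the raw body.
scope "/webhooks" do
  pipe_through [:parsed_with_cached_body]
  # routes that need the raw body go here
end
```

Other pipelines would declare Plug.Parsers without the `:body_reader` option, so the raw body is kept only where it is needed.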
At which point, anything that has a conn (i.e. for you, your controller, or for us, our auth plugs) can access the raw body by simply calling Plugs.CacheBodyReader.read_cached_body(conn).
It's also worth noting that there are reasons Phoenix didn't opt to keep this raw body around, so if you only need it in certain places, it's probably wise to only add this special parsing to those places.
Hope this helps!
@coladarci can you please define
@jochasinga the raw body is NOT guaranteed to be the same as re-encoding the params)
I understand this but in case of JSON they are the same. And this means you won't have to carry the raw body around because you only re-decode it once for that controller.
I wanted to include it for completeness, but I did acknowledge the plug solution and even went with it for my project.
What I meant was that the raw body has escaped spaces, newlines and the order of the keys is whatever the client sent, etc. The moment you parse it, you lose most, if not all, of that information.
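A small illustration of that point, using only the standard library. The two payloads and the secret are made up for this example. Both strings describe the same JSON object, but a signature computed over one does not match the other.

```elixir
# What the client actually sent: its own key order, spacing and a newline.
raw_body = ~s({"b": 1,\n  "a": 2})

# What re-encoding the parsed params could plausibly produce (illustrative).
reencoded = ~s({"a":2,"b":1})

secret = "illustrative-secret"

# HMAC-SHA256 over the exact bytes, hex encoded.
hmac = fn body ->
  :crypto.mac(:hmac, :sha256, secret, body) |> Base.encode16(case: :lower)
end

# The digests differ because the bytes differ, even though the JSON is equivalent.
IO.puts(hmac.(raw_body) == hmac.(reencoded))
# prints false
```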
For those who end up here trying to understand why their body was missing some whitespace: For me it was because I tested with curl and its -d (--data) option. I now use --data-binary and I can see that cowboy receives and reads the expected body.