To enable HTTP connectivity behind corporate firewalls, a number of tools and programming languages support HTTP/HTTPS proxies defined through environment variables like
HTTP_PROXY=http://proxy.com
HTTPS_PROXY=https://proxy.com
NO_PROXY="*.home.com,another.com"
Note that there seems to be no consensus on the case of these variables, and all-lowercase variable names are also very common. My limited research suggests that at least the following languages automatically obtain and use a proxy from the environment:
The `request` module also supports these variables, but I feel they should be respected by core `http` and `https` for best compatibility.
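To make the convention concrete, here is a minimal sketch of how a client might pick a proxy from these variables. The helper names (`getEnv`, `shouldBypass`, `proxyForUrl`) are hypothetical and not part of Node core, and the `NO_PROXY` matching rules shown are an assumption, since tools differ in how they interpret wildcards and leading dots.

```javascript
// Hypothetical helpers, not part of Node core.

// Read a variable in either case; the lower-case name wins if both are set.
function getEnv(env, name) {
  return env[name.toLowerCase()] || env[name.toUpperCase()] || '';
}

// Decide whether a hostname is excluded by a NO_PROXY-style list.
function shouldBypass(hostname, noProxy) {
  if (!noProxy) return false;
  return noProxy.split(',').some((entry) => {
    const pattern = entry.trim().toLowerCase();
    if (!pattern) return false;
    if (pattern === '*') return true; // bypass the proxy for every host
    // "*.home.com" and ".home.com" both match any subdomain of home.com.
    const suffix = pattern.replace(/^\*/, '');
    if (suffix.startsWith('.')) return hostname.endsWith(suffix);
    return hostname === suffix;
  });
}

// Return the proxy URL to use for urlString, or '' for a direct connection.
function proxyForUrl(urlString, env = process.env) {
  const { protocol, hostname } = new URL(urlString);
  if (shouldBypass(hostname.toLowerCase(), getEnv(env, 'no_proxy'))) return '';
  const scheme = protocol.slice(0, -1); // "http:" -> "http"
  return getEnv(env, scheme + '_proxy');
}
```

With the variables from the example at the top, `proxyForUrl('http://example.org/')` would yield the `HTTP_PROXY` value, while a URL on `x.home.com` or `another.com` would yield an empty string.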
While I do agree that this is a very common thing to want, I think it is important to point out that so are many other things that `request` implements. I think it is pretty cool of Node to allow bypassing the environment variable, so if it is implemented I think it should be optional and, for backwards compatibility reasons, "off-by-default".
off-by-default
I don't think this is going to work here. Imagine a CLI tool spawning a child process of `node`. In that case, the user cannot reasonably provide a `--flag` to enable proxy environment support. I think support should be unconditionally enabled. If one wants to skip the proxy, they can always do `export NO_PROXY="*"` (or unset the variables).
...cannot reasonably provide a --flag to enable proxy environment support...
I think this is a fallacy _(don't ask me which)_ because the statement can be reversed to: The user cannot reasonably provide a `--not-flag` to disable the proxy environment support.
The person that writes the request:
http.request(url, {
  respectEnv: true
})
decides if this request respects the environment variable (new mode) or not (legacy).
I am pushing this because HTTP_PROXY environment variables and errors related to them are horrible to debug once you have an application running and just upgraded a node version.
In any case I would say that a default of `respectEnv=false` should be treated as semver:minor, and a default of `respectEnv=true` as semver:major.
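As a rough illustration of the opt-in idea, the sketch below computes the options a request would use with and without the flag. `respectEnv` is the hypothetical option from the comment above, not an existing `http.request` option, and `requestOptionsFor` is an invented helper name. It only covers plain `http:` targets going through a plain HTTP forward proxy; HTTPS targets would need a CONNECT tunnel instead.

```javascript
// Sketch only: `respectEnv` is hypothetical and defaults to off (legacy mode).
function requestOptionsFor(urlString, { respectEnv = false } = {}, env = process.env) {
  const target = new URL(urlString);
  const direct = {
    hostname: target.hostname,
    port: target.port || 80,
    path: target.pathname + target.search,
    headers: { Host: target.host },
  };

  const proxyUrl = respectEnv ? (env.http_proxy || env.HTTP_PROXY) : undefined;
  if (!proxyUrl) return direct; // legacy behaviour: the environment is ignored

  // A plain HTTP forward proxy expects the absolute URL as the request path.
  const proxy = new URL(proxyUrl);
  return {
    hostname: proxy.hostname,
    port: proxy.port || 80,
    path: target.href,
    headers: { Host: target.host },
  };
}
```

The returned object could be passed to `http.request`. Because the flag defaults to `false`, existing callers would see no change in behaviour, which is the semver:minor case described above.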
Not saying this shouldn't be done, but there is a bit of a security risk inherent in this. Using an environment variable, it would be possible for a rogue module to set the environment variable discreetly, causing traffic to be redirected through the proxy without the developers' or users' awareness.
@jasnell Pretty sure something similar could be achieved by monkey-patching `http` right now. In the end, you just have to trust your modules.
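For readers unfamiliar with the term, monkey-patching here means replacing a function on a shared module object at run time. The sketch below shows the general shape; `patchRequest` is a hypothetical helper written for illustration, and it takes the module object as a parameter so the idea can be tried on a stand-in object rather than the real `http` module.

```javascript
// Wraps `request` on an http-like module object so every call is observed
// before being forwarded. Returns a function that undoes the patch.
function patchRequest(httpModule, onRequest) {
  const original = httpModule.request;
  httpModule.request = function patchedRequest(...args) {
    onRequest(args[0]); // a rogue module could rewrite the arguments here instead
    return original.apply(this, args);
  };
  return function restore() {
    httpModule.request = original;
  };
}
```

Any module loaded into the process could do something similar, which is why the comment above concludes that the modules themselves have to be trusted.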
Yep, as I said, I didn't say it shouldn't be done ;-) If we do it tho, we need to make sure the risks are well documented.
-1 as I think this is something that is better handled in userland
This would depend on proxy support in the agent, something which https://github.com/nodejs/node/issues/1490 looks to have been for, but was closed for some reason.
Also to quote @sindresorhus from https://github.com/sindresorhus/got/issues/79:
I strongly believe this is something that should be a part of Node.js and not every module doing HTTP
it would be possible for a rogue module to set the environment variable discreetly causing traffic to be redirected through the proxy
A rogue module can monkey-patch all the methods in the `http` module if it wants to. Rogue modules are their own separate problem that I don't think we can solve here.
-1 as I think this is something that is better handled in userland
I don't think the structure of a network which is beyond my control qualifies as a userland problem. My OS deals with any proxy I may or may not need to connect through so that each application doesn't have to deal with this. Does this analogy not hold for nodejs core? If not, do we still want to continue burdening every module using `http` with implementing this option?
Related: https://github.com/nodejs/node/issues/1490
To summarize: "Proxy support?" "Not saying no but..."
I would agree that ultimately this is something that should be handled directly within node.
Node should read my environment variables, see that I am starting the node executable with my `http_proxy` environment variable set, and conclude that I want all HTTP requests to go through my proxy.
curl, npm, git, etc, all respect these environment variables by default. This is the purpose of the environment. The user has already specified these within their environment, so the fact that they aren't being respected is very confusing.
It shouldn't be left up to a non-node-core library developer to decide to support my proxy environment by properly configuring node supported HTTP module with my environment variables or via a configuration to be passed into their library because ultimately I may not be directly using those modules, such as in the case of using a framework that includes a library that includes a library that interfaces with the HTTP module. Thus this "userland" solution essentially creates a recursive issue through all dependencies which is much harder to solve in all cases.
Since the node HTTP library is at the core of this, it should solve this issue by respecting the environment variables used at run time and override other attempts where libraries / modules may try to set these settings itself unless some other environment variable is passed to allow for this.
@matthewwiesen The counterargument to your argument is that curl, npm and git are all end user programs, whereas node is a platform.
A better comparison is python: the builtin httplib doesn't respect http_proxy, that is left to user libraries. Python, unlike node, has a strong "batteries included" philosophy, so that is saying something.
Also: big slippery slope. Yes, curl respects http_proxy... but it also honors all_proxy, no_proxy with patterns and wildcards, and happily parses your .netrc with every connection. I don't think that has a place in core. Core is for mechanism, not policy.
Looking at Python, the documentation does mention respecting http_proxy in the core `urllib.request` module:
https://docs.python.org/3.4/library/urllib.request.html?highlight=http_proxy
In addition, if proxy settings are detected (for example, when a *_proxy environment variable like http_proxy is set), ProxyHandler is default installed and makes sure the requests are handled through the proxy.
Although this page does tell "users" that a higher-level HTTP client interface such as the `Requests` module is recommended, which seems to also support the environment variables:
http://docs.python-requests.org/en/master/user/advanced/#proxies
Looking at Go, it seems to be pretty similar; it seems to be supported within their Net HTTP Transport module:
https://golang.org/src/net/http/transport.go?h=http_proxy
Looking at Ruby, this seems to be supported as well:
https://github.com/ruby/ruby/blob/f845a9ef76c0195254ded79c85c24332534f4057/lib/net/http.rb#L638
Unless I'm mistaken on these sources.
I picked httplib because it's the python counterpart to node's http module. urllib is an 'open any kind of url' toolkit - useful, but without a node equivalent.
requests is actually a good example of what I mean. It's the python sister to request. request supports proxies, tunnels, auth, etc; it's the go-to package for anyone making http requests.
I don't see a compelling reason to duplicate that functionality in core when a best-of-breed solution exists and is in widespread use.
node's focus is implementing HTTP mechanism well, not policies, to enable diversity and iterative improvement of user-facing HTTP client libraries like https://www.npmjs.com/package/fetch and https://www.npmjs.com/package/request. It happens that node core `http` is reasonably usable as-is, but if you hit its limitations, I suggest you use something else.
Since node's APIs cannot be changed/improved without causing great difficulties, because they can't be properly versioned, unlike userland modules, we should be very, very cautious in adding functionality. The "widely useful" test is not sufficient; it needs to be "doing it in userland causes real problems". node comes with npm included; it's a great way to select the API and version of that API that you want to use that has the features you need.
So just to be clear, the proposed solution here is to:
This ultimately leads toward educating a significant sector of the Node community about HTTP proxy environments, when this is something that is specific to how HTTP requests should be handled within a specific user environment.
Why should we go to every node module author to tell them that we can't use their framework/module/library because, at the end of the day, somewhere something is wrapping the node HTTP request and they specifically didn't support the HTTP Proxy environment? In all likelihood they were not aware of this use case, since probably over 95% of the node community doesn't have an environment like this. They likely aren't aware of running their code within a proxy environment and thus did not allow for or implement pass-through support to allow an end user to configure their code to ultimately configure Node HTTP to work properly in an environment where access to the external internet must go through an HTTP Proxy.
@bnoordhuis To address some of your comments from here:
I don't think proxy support in core is completely out of the question but it's a bit of a slippery slope.
Speaking from prior experience where proxy support was added to an existing product, you start with HTTP CONNECT but you end up supporting everything from SOCKSv5 to HTTP POST, basic/digest/and-so-on auth, Kerberos, NTLMv1 and v2, custom CA chains, client certificates, certificate fingerprinting, etc., etc.
There is no strict need to add it to core either because perfectly good user-land solutions exist.
I don't see how allowing HTTP proxy will suddenly open the flood gate to supporting all of those items you mention.
Regarding the SSL Certificates:
I will say that I think supporting internal CA certificates should also be supported, since ultimately CA certificates themselves are already supported within Node's HTTPS module via the TLS module, which indicates that when the `ca` value is "omitted several well known "root" CAs (like VeriSign) will be used", which are included via openssl by default.
The same issue as with the proxy environment occurs with custom internal root CA certificates: there is no easy way to inject the internal CA certificate chains at run time via an environment variable so that the TLS module will augment its default CAs and append my internal CA certs to the chain it already supports. What is required today is that all downstream modules play nice and allow these configuration options to be passed through the full chain, so that my HTTPS requests to sites secured with an internal-CA-issued certificate will validate properly. Otherwise there is always the insecure `NODE_TLS_REJECT_UNAUTHORIZED=0` workaround to simply not validate my internal-CA-issued certificates, if there is no way for me to configure/pass my root CA certificate through the various dependency trees of my frameworks/libraries, which may not have thought to support this since over 95% of the node community doesn't run their own Certificate Authority.
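For reference, this is roughly what passing an internal CA looks like when the calling code does have access to the request options. It is a sketch under assumptions: `agentWithExtraCa` is a hypothetical helper, and `tls.rootCertificates` requires a newer Node release (12.3 or later) than the versions discussed in this thread.

```javascript
const https = require('https');
const tls = require('tls');

// Builds an https.Agent that trusts Node's bundled root CAs plus one extra
// PEM-encoded certificate (for example, an internal corporate root).
function agentWithExtraCa(extraCaPem) {
  return new https.Agent({ ca: [...tls.rootCertificates, extraCaPem] });
}
```

The resulting agent would be passed as the `agent` option of `https.request` or `https.get`. The point made above still applies: this only helps when every library between the end user and `https` lets the agent or `ca` option through.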
Finally, in regard to security with respect to the "rogue module" scenario, wouldn't implementing this feature in core prevent this?
If node HTTP were itself setting the proxy settings via the user-supplied environment variable and essentially rejecting configuration options passed to it when an environment variable is available, then there would be no way for a "rogue module" sitting between the end user's code and the Node HTTP module to manipulate this.
I agree that supporting these changes does seem to break with how things work today, but:
NODE_HTTP_PROXY_ENV_VARIABLE=0
) or something of that nature.
Personally, I would think that if someone were starting their application/script with their HTTP proxy environment set, then it is not unreasonable to assume that this is what they want since they configured this. If they did not want to proxy their requests, they could always do what they do now, by not setting the HTTP proxy and directly configuring this within their various libraries / modules or directly via the node HTTP lib.
I will say that I think supporting internal CA certificates should also be supported
Support for custom CAs has landed in https://github.com/nodejs/node/pull/9139. It's not released yet, but should be available in the next minor 7.x release, and it'll possibly be backported some point after that.
To be clear, your straw man procedure:
is just that, a strawman. The actual procedure is:
Also, I suggest that 95% of users are probably using `request` already, so are already unaffected by this.
What exact problem did you have that led you to think Node should do this in core?
Did you not know that node encourages the use of npm modules, write code using `http`, find that your code didn't work, and that surprised you? If so, we can strengthen the docs on this.
Or did you find a third party module that was coding directly to the `http` API, and it didn't work? If so, we should still strengthen the docs, and an issue should be opened with that third party module.
I don't see how allowing HTTP proxy will suddenly open the flood gate to supporting all of those items you mention.
That's simply how such things play out when a piece of software is popular. It's only a matter of time before someone requests them because "feature $x doesn't work for me because you don't support feature $y."
To answer your specific questions: SOCKS because people use it (request supports it although I don't know if it does DNS-over-SOCKS), NTLM and Kerberos because many proxies require authenticated connections.
@silverwind great to hear this is being included. Looking at #9139 this seems like good news on this front. Glad to see there is agreement there.
@sam-github The example that I am providing is that:
I would like to use a Node web application framework; this application is dependent upon lots of modules. Those modules are dependent upon modules, which are dependent upon modules, and ultimately we arrive at https://www.npmjs.com/package/got, which is where we see the issue.
`got` appears to directly interface with HTTP, but doesn't appear to support HTTP Proxy itself and bills itself as a stripped-down version of `request`. However, `got` appears to be somewhat popular with 3.5 million downloads over the last month, and it appears that 645 other modules use `got` as a direct dependency.
Now you say that `request` is depended upon by 95% of the base, but npm shows it gets installed 18.5 million times per month and 19k modules are dependent upon it. While `got` appears to be in the minority here, it doesn't seem that insignificant either.
So what are my options here? Go to the `got` authors and ask them to support HTTP Proxy?
Go to the dependencies that use got and tell them that they should switch to `request` to support HTTP Proxy?
If `got` ends up supporting HTTP proxy, do they do as I suggest and make this simple by looking for the environment variable (much in the same way that request does this), or do they implement this in some other way that requires passing configuration options through code, so that I need to get modules that depend on this to support these new configuration options?
Great, we solved this for `got` now, but what about the next module that breaks this convention?
You see how, to essentially support HTTP proxy, we are forcing this to be done on every module? Or is the solution just to not use that module?
I don't see how that is a solution at all.
I certainly don't see a straw man here, since I'm not misrepresenting or exaggerating the options.
If anything, the opposing argument I see here is that if we implement HTTP proxy then it opens the flood gate, which is certainly an assumption and definitely not true.
Thanks @matthewwiesen , that helps a lot to give context.
And I'm still not in agreement, and somewhat baffled. `got` exists exactly because node core doesn't support:
following redirects, promises, streams, retries, automagically handling gzip/deflate and some convenience options. Created because request is bloated (several megabytes!).
Why is this one single feature (that is missing from `got`), proxy support, something that should be merged into node core, and the laundry list of features above something that should not? Where would you and @sindresorhus draw the line? I draw the line at: must be implemented in the Node HTTP layer to enable user land to efficiently implement friendly APIs on top. By which categorization, proxy, redirects, etc. are out.
Particularly when the very existence of `got` is a reaction to `request` having "too many" features, you can see how one person's "bloat" is another's "essential", and how much node should strive to not have any unnecessary features, so it can never be bloated.
I'd say, the bare-bones nature of node's HTTP enables fetch/request/got to all have their own opinions on what features are bloat, and what are essential. If `got` thinks proxy support (or following redirects) is essential, then implement it.
`got` can get together with `fetch` (https://github.com/andris9/fetch/issues/23) and maybe even `request`, and share the same code, code which may already exist: https://github.com/sindresorhus/got/issues/79#issuecomment-127205865
This brings me to my original request.
If a user has explicitly defined the HTTP proxy environment variable in the environment in which the node process is started, when would it ever be the desired behavior that this is not respected?
This is the main reason why I believe this should be embedded in core: so that it is enforced as per the user's explicit request, made by setting the HTTP proxy environment variable within the shell, ensuring that ALL HTTP requests made through node pass through this user's proxy. If this is not done, undesirable effects will occur.
I believe this fits into the similar use case of why it is important to allow the user to define their own custom CA certificates within an environment variable and I agree with the point you made here.
Ultimately this is necessary in core to ensure that node HTTP behaves consistently and enforceably in corporate/enterprise environments, so that the HTTP Proxy is honored regardless of downstream module authors. They may not support this because they are not familiar with proxy environments, which leads to their modules not behaving correctly within a corporate environment where a proxy is essentially standard. Requests they expect to function normally will be blocked because they are not routed through the HTTP Proxy defined by the user's environment variables.
From the user's perspective, ALL HTTP requests need to honor the proxy environment. When the HTTP Proxy environment variable is set but not enforced, requests will not function unless every downstream module that interfaces with Node HTTP configures the proxy itself, which raises the question: why would we not want this in core?
This is why I propose that HTTP proxy support be included in core, so that this is the default effect when a user supplies the HTTP proxy environment.
As for all other HTTP customization/configurations these are not relevant to core.
If the user did not want to enforce the HTTP Proxy in this proposed way, it would be reasonable that the user not define their HTTP Proxy environment variable and leave this unset.
Edit:
Fixed some spelling/typos being on mobile.
@nodejs/ctc needs more input. I'd be interested in knowing from the authors of request (@mikeal, @simov ), fetch (@andris9), and got (@sindresorhus).
I'm half convinced there should be a canonical github.com/nodejs/http-proxy or something of the like that can be used by down stream HTTP client APIs, still not so convinced that it should be in node core, would like to hear more from implementors.
Just to be sure, did you have node-fetch in mind instead of my fetch, which is some pretty old code and not used so much?
I don't know a lot about HTTP proxies but I have added proxy support to Nodemailer for SMTP connections (docs here). Basically I added a new method `getSocket` that by default opens a new socket against the provided host and port but can be overridden to provide an existing socket from somewhere else.
I did indeed mean that, sorry. /to @bitinn and @TooTallNate because he implemented https://github.com/TooTallNate/node-https-proxy-agent
As the original issue submitter of #1490 (global proxy support) and developer of `node-fetch`, it was me that ultimately dropped the case for built-in environmental proxy support.
I opted for this:
But I think an alternative would be to advocate that user-land modules expose custom Agent passing, say somewhere in the documentation.
I am not sure how far along we are on this documentation update?
I think the node community ultimately gravitates towards a few well-known userland solutions when doing HTTP requests; adding proxy support to those isn't particularly difficult as long as those solutions support `Agent`.
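As an illustration of what "custom Agent passing" could look like in a userland library, here is a minimal sketch. The function names (`buildOptions`, `isUp`) and the library itself are hypothetical; the only real API relied on is the `agent` option of `http.request`.

```javascript
const http = require('http');

// Hypothetical library internals: translate a URL plus caller options into
// http.request options, forwarding the caller's Agent untouched if given.
function buildOptions(siteUrl, { agent } = {}) {
  const { hostname, port, pathname, search } = new URL(siteUrl);
  const options = { method: 'HEAD', hostname, port: port || 80, path: pathname + search };
  if (agent) options.agent = agent; // e.g. a proxy-aware Agent from the caller
  return options;
}

// Hypothetical public function: reports whether a site answers over HTTP.
function isUp(siteUrl, callerOptions, callback) {
  const req = http.request(buildOptions(siteUrl, callerOptions), (res) => {
    res.resume(); // discard any body
    callback(null, res.statusCode < 500);
  });
  req.on('error', callback);
  req.end();
}
```

A caller behind a proxy could then pass an agent from a module such as `proxy-agent`, while callers who pass nothing keep the default behaviour.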
I can understand some developers, particularly those dealing with a corporate firewall or creating a client-side node app, would like simple built-in support in core to achieve their goal. I just don't think it will be fine-grained enough for other use cases. For example, myself, which use a SOCKSv5 proxy, I _know_ most platform/environment/app doesn't support that.
So consider me a -0 on this case, I no longer think this is a good idea for my own use case.
As for userland issues, like Joker says, _all you need is just a little push (request to the userland solution you want proxy support)_.
(EDIT: my network isn't great, partial comment now fixed)
I did indeed mean that, sorry. /to @bitinn and @TooTallNate because he implemented https://github.com/TooTallNate/node-https-proxy-agent
For example, myself, which use a SOCKSv5 proxy, I know most platform/environment doesn't support that.
I'm not entirely sure how https-proxy-agent got involved here, but as far as SOCKSv5 support goes, you probably want to use the more generic `proxy-agent` module which _does_ support connections to SOCKS servers.
Not sure why this discussion seems to be gravitating towards SOCKSv5, since that is a general proxy for TCP connections between client and server, acting on any network protocol on any port.
This is mainly focused on HTTP proxy which is specific to HTTP only and HTTP is supported from Node core.
As one more follow up, while there exist modules such as `proxy-agent` from @TooTallNate and `node-fetch` from @bitinn, this issue is not specifically targeting those modules, but rather the developers who use those modules.
Looking at `proxy-agent`, it defines the following in its sample use case:
var proxyUri = process.env.http_proxy || 'http://168.63.43.102:3128';
var opts = {
  method: 'GET',
  host: 'jsonip.org',
  path: '/',
  // this is the important part!
  agent: new ProxyAgent(proxyUri)
};
http.get(opts, onresponse);
Now, if I were directly using the `proxy-agent` module, I wouldn't have this issue since I would just configure it properly myself. But if a developer who is directly interfacing with one of the above modules creates a web framework or some other library, packages it up, and doesn't specifically support the HTTP proxy, then when I, the end user, attempt to use this packaged module, my proxy environment variables will not be honored and their module will not function as they expect within my environment.
So you see that to correct this proxy issue we have to go further downstream, since there isn't a way for the end user's HTTP Proxy environment variables to be directly configured within Node HTTP so that, as an end user, I can be sure that ALL HTTP requests through a node process honor my proxy environment.
I think this should get feedback from @nodejs/ctc .
Would also be good if it supports specifying SOCKS proxy in env.
@stevenvachon proxy-from-env is not a solution to the problem described here. The problem is that the user who is behind a HTTP Proxy is executing other developers code which may not be aware of the HTTP Proxy and thus it fails.
Here is a good example of what I am talking about:
Someone publishes a new NPM module that essentially checks to see if a Website is available over HTTP and as a user, I write my own code to use that module such as this:
var isitdownorjustme = require('isitdownorjustme');
var response = isitdownorjustme('http://www.google.com');
This module works great for anyone not behind an HTTP Proxy, because the module directly calls out to those sites. But this module did not implement its own HTTP Proxy awareness via one of the many modules listed in this thread, so when a user who is behind an HTTP Proxy attempts to use this module it will always fail to function in the intended manner, as the module author may not be aware of what environments the module is being used in.
So as a user, I set my own HTTP_PROXY environment variables and execute my simple script, however, node itself does not automatically detect how the user intends HTTP requests to be routed through HTTP Proxy variables, so the user has no recourse to ensure that this module will function within their environment.
The proposal listed in this issue is to allow for node to behave in the way as shown above so that module authors do not need to explicitly support HTTP Proxies as the user can set these variables at execution time through the environment if not already specified via configurations to modules that are HTTP Proxy aware.
@sam-github Is there any other feedback from @node/ctc?
@matthewwiesen I wasn't implying that it was an end-all solution, but rather a means of detecting the proxy more completely than most would think to do, and to apply either to core or to instances of `http.request()` via something like tunnel.
@matthewwiesen No, I labelled it for agenda.
Seems to be a feature bloat to me.
Let me elaborate on this a bit: rather than introducing proxy into the core, I'd like to suggest introducing API hooks to be able to set the proxy implementation from the user-land.
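The comment does not spell out what such hooks would look like. Purely as an illustration, and assuming a single process-wide resolver callback, one possible shape is sketched below. Both `setProxyResolver` and `resolveProxy` are invented names; nothing like them exists in Node core.

```javascript
// Entirely hypothetical: neither function exists in Node core.
let proxyResolver = () => null; // default: no proxy, i.e. today's behaviour

// Userland code would register a function that maps a URL to a proxy URL.
function setProxyResolver(resolver) {
  if (typeof resolver !== 'function') {
    throw new TypeError('resolver must be a function');
  }
  proxyResolver = resolver;
}

// Core (in this sketch) would consult the resolver before connecting.
function resolveProxy(urlString) {
  return proxyResolver(new URL(urlString)) || null;
}
```

Under this assumption, a userland module could call `setProxyResolver` with a function that reads `HTTP_PROXY` and `NO_PROXY`, keeping the policy outside core while core only provides the mechanism.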
@indutny I'm all for having a way for this to be implemented to allow some overriding mechanism to be available to the user.
I guess I would want to know more about how this proposed hook would work.
At least to me, when I hear hook, my first thought is this sounds like this would allow the user to have a programmatic ability to dynamically change this behavior at runtime in some way. If that is the case, this sounds interesting, I'd be curious to understand more about how this would work.
Or is this something like the user maybe being required to set an additional environment variable such as `ENFORCE_HTTP_PROXY=1`, similar to how we can override node's default behavior for TLS verification as I mentioned previously in this post?
@indutny How would these API hooks allow me as a user of modules that may or may not have proxy support, enforce that the modules respect my HTTP_PROXY settings?
If this doesn't make it into v8.0, Node.js will remain broken for years more.
Will chime in with another example.
Our internal GitLab CI environment spins up a Docker instance using node:latest, clones our project and runs through our script, the first step of which is running `npm install` to install all of the npm packages required for the application.
This works great, up until you try to install `phantomjs-prebuilt`, where installing this module basically spins up a node process itself to fetch a packaged `tar.bz2` from GitHub to install this packaged version. This node process fails to fetch this URL because the proxy environment is not honored by node at run time via the environment variables we have configured.
As you can see below, this returns an error `statusCode` of `407`, which relates to HTTP Error `407 Proxy authentication required`.
Excerpt from npm install:
...
npm info lifecycle [email protected]~install: [email protected]
> [email protected] install /builds/<group>/<project>/node_modules/phantomjs-prebuilt
> node install.js
PhantomJS not found on PATH
Downloading https://github.com/Medium/phantomjs/releases/download/v2.1.1/phantomjs-2.1.1-linux-x86_64.tar.bz2
Saving to /tmp/phantomjs/phantomjs-2.1.1-linux-x86_64.tar.bz2
Receiving...
Error making request.
Error: tunneling socket could not be established, statusCode=407
at ClientRequest.onConnect (/builds/<group>/<project>/node_modules/tunnel-agent/index.js:165:19)
at Object.onceWrapper (events.js:293:19)
at emitThree (events.js:116:13)
at ClientRequest.emit (events.js:197:7)
at Socket.socketOnData (_http_client.js:443:11)
at emitOne (events.js:96:13)
at Socket.emit (events.js:191:7)
at readableAddChunk (_stream_readable.js:176:18)
at Socket.Readable.push (_stream_readable.js:134:10)
at TCP.onread (net.js:554:20)
Please report this full log at https://github.com/Medium/phantomjs
npm info lifecycle [email protected]~install: Failed to exec install script
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@^1.0.0 (node_modules/chokidar/node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})
npm ERR! Linux 3.10.0-514.el7.x86_64
npm ERR! argv "/usr/local/bin/node" "/usr/local/bin/npm" "install"
npm ERR! node v7.7.1
npm ERR! npm v4.1.2
npm ERR! code ELIFECYCLE
npm ERR! [email protected] install: `node install.js`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] install script 'node install.js'.
npm ERR! Make sure you have the latest version of node.js and npm installed.
npm ERR! If you do, this is most likely a problem with the phantomjs-prebuilt package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR! node install.js
npm ERR! You can get information on how to open an issue for this project with:
npm ERR! npm bugs phantomjs-prebuilt
npm ERR! Or if that isn't available, you can get their info via:
npm ERR! npm owner ls phantomjs-prebuilt
npm ERR! There is likely additional logging output above.
...
This is something that should be supported by node so that node will honor proxy environments properly.
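For context on the log above: the error comes from an HTTP CONNECT tunnel, and status 407 is the proxy asking for credentials. The sketch below shows how a client might build the CONNECT request, assuming the credentials are embedded in the proxy URL and the proxy accepts Basic authentication. `connectOptions` is a hypothetical helper, and this is not a diagnosis of the specific failure in the log.

```javascript
// Builds http.request options for an HTTP CONNECT tunnel through a proxy.
// Credentials in the proxy URL become a Basic Proxy-Authorization header.
function connectOptions(proxyUrl, targetHost, targetPort = 443) {
  const proxy = new URL(proxyUrl);
  const headers = { Host: `${targetHost}:${targetPort}` };
  if (proxy.username) {
    // URL userinfo is percent-encoded, so decode before base64-encoding.
    const user = decodeURIComponent(proxy.username);
    const pass = decodeURIComponent(proxy.password);
    headers['Proxy-Authorization'] =
      'Basic ' + Buffer.from(`${user}:${pass}`).toString('base64');
  }
  return {
    method: 'CONNECT',
    hostname: proxy.hostname,
    port: proxy.port || 80,
    path: `${targetHost}:${targetPort}`, // CONNECT targets host:port, not a URL path
    headers,
  };
}
```

These options would be handed to `http.request`, and the `'connect'` event on the returned request delivers the proxy's response together with the tunnelled socket. A `statusCode` of 407 in that response means the proxy did not accept the request without (valid) credentials.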
I'm not sure why this was removed from `ctc-agenda`.
From the above references and reviewing the live streams, it doesn't appear that anything was discussed?
Was this discussed at some other point in the meeting @targos?
Supporting this feature is critical in enterprise environments that are behind corporate proxies.
I'm not sure why this was removed from ctc-agenda
It was discussed in the January 25 meeting, see https://github.com/nodejs/CTC/issues/63 and the meeting minutes.
@bnoordhuis Thanks for sharing, for those wanting to review the discussion that took place, here is a link to when this starts:
It seems that the discussion that took place centers on user modules, and a proposed API hook is described in general terms. While these API hooks may allow a user/coder to override the http agent in a way that enforces their HTTP PROXY environment, how would this solve the most recent use case that I've provided?
In this case, node itself is being spun up outside of the user's control via `npm install` within a CI environment. I have exported the HTTP PROXY environment variable within the CI environment to make `npm install` function correctly by downloading modules through our proxy server. The issue here is that one of these modules has a built-in node script that attempts to download a `tar.bz2` bundle to install necessary items. I don't see how I as a user could enforce the HTTP PROXY environment on this executed node script other than via the HTTP_PROXY environment variable (which is provided), since even if an API were provided, it seems it would require programmatic access within the node script being executed, which in this case is owned by `phantomjs`.
It still seems that the only way to truly support this is that node itself should honor HTTP PROXY environment variables, otherwise the user has no recourse to enforce this in the many ways that this issue will surface itself.
Any feedback / updates on how the proposed user-land API fix would work in a scenario where node itself is spun up outside of the user's control, but within the user's environment?
That's no different from a httplib-based python program: it's up to the programmer to make it work the way the end user wants. The platform should provide the building blocks, it doesn't need to come prefab.
@bnoordhuis I'll remind you that in this use case, I'm not a programmer.... I'm attempting to use other people's node modules and THOSE developers are not supporting HTTP Proxy, which is a requirement in my environment.
So either Node.js must support this or I will have to go to every single node module author and open issues with them so they can modify their code to support the HTTP Proxy since their code is expecting to connect to the internet, but is not able to in my environment due to corporate security controls / restrictions.
I've already outlined these points earlier in this thread.
Then you are an end user. It's not node's responsibility to cater directly to your needs, that's the module author's job.
Solving this issue at every node module is not a scalable solution.
@cjk What's with the downvote? I'm open to hearing constructive feedback on how to solve this problem. Feel free to provide your viewpoint on how to solve it.
It is unreasonable to require node module developers to support the HTTP proxy environment, since it takes far too much effort to go around the entire node module community informing / educating developers who are unaware of proxy environments.
The easiest solution is to give node itself an overriding mechanism which allows the end user to configure it with the HTTP_PROXY environment variables so that HTTP requests are actually proxied as the user intends them to be... otherwise the user wouldn't be specifying this variable in the first place.
I don't see how implementing such a configuration option would break functionality of node.js. I'm also open to hearing more about this aspect, since while there is much dissent surrounding this, I don't believe there have been concrete examples of what kinds of impact this would have.
It feels like this issue is not progressing. But actually I can see a way how someone could push this issue!
1. […] _(Just reflecting the opinion of the TC regarding proxies in the Node.js docs)_
2. […] could become part of Node.js core in order to reduce the implementation cost for current and new packages.
3. […] in case someone is stuck with user-land code that doesn't implement proxies.
@martinheidegger I would agree with most of this, but I don't think that having just a flag to node will be enough. I've demonstrated above a use case where just running npm install phantomjs-prebuilt executes its own node script, so a user has no way to specify any flags to node other than via the environment variables of the environment in which the node script is executing.
@matthewwiesen One step at a time?
As a note, I've reviewed phantomjs-prebuilt more closely and it looks like their code crafts the proxyURL via the NPM Config environment variable, which is then used in the request module:
We don't set this since NPM itself honors the HTTP_PROXY environment variables:
So as you can see there are so many inconsistencies with how proxy environments are either supported or not supported among the various node_module developers, which is the frustrating piece when you are a user sitting behind a corporate proxy environment.
It would just be nice if there was one way to ensure that the HTTP Proxy was allowed to be overridden where developers such as this don't seem to support the de facto standard way to support proxy environment variables... It just seems that because this doesn't exist it leaves so much up to various interpretation and places the burden on node_module developers to support something that the user should be able to specify directly to node at runtime.
Just chiming in here – as a user behind a corporate proxy with its own internal npm registry it's a nightmare installing any internal npm packages that have dependencies that perform HTTP requests in their postinstall scripts (e.g. electron, phantomjs-prebuilt etc). Almost invariably, these dependencies haven't taken the steps to inherit the HTTP_PROXY etc env vars, so the install will fail.
Most package authors (myself included, until now) don't generally consider users behind proxies when performing an HTTP request, so automatically inheriting the environment configuration seems like the sane default option to me.
If you're one of the 0.1 per cent of Node users who is writing a low-level HTTP library that needs to tinker around with proxy settings, you're almost certainly capable of passing an option to disable the automatic proxying when calling the Node HTTP API functions.
Please enable automatic support for the HTTP_PROXY env vars, it's already bad enough behind a corporate firewall without the constant worry of node modules not installing properly!
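To make "inheriting the HTTP_PROXY etc env vars" concrete, here is a minimal sketch of the lookup side only. The function name `proxyFromEnv` is made up for illustration, and checking the lowercase name before the uppercase one is an assumption, since tools disagree on precedence. It does not handle NO_PROXY and it does not send any request.

```javascript
// Hypothetical helper (not a Node.js API): return the proxy URL configured in
// the environment for a given target URL, or null if none applies.
// Assumption: lowercase variable names win over uppercase ones.
function proxyFromEnv(targetUrl, env = process.env) {
  const scheme = new URL(targetUrl).protocol.slice(0, -1); // "http:" -> "http"
  if (scheme !== 'http' && scheme !== 'https') return null;
  const name = `${scheme}_proxy`; // http_proxy or https_proxy
  return env[name] || env[name.toUpperCase()] || null;
}

// Example: which proxy would apply to this request in the current environment?
console.log(proxyFromEnv('https://registry.npmjs.org/'));
```

A package author would still need to route the request through the returned proxy; the lookup is the easy part.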
There's also detecting when the OS' proxy has changed. Environmental variables cannot do this, but something like os-proxy might. These are non-trivial things that each project should not be responsible for.
@stevenvachon That's not something node.js would do for you, even if it otherwise supported proxies, because it would mean taking on a slew of system dependencies. Same reason it doesn't detect timezone changes or DNS server changes (right away anyway.)
@indutny and @martinheidegger proposed ways forward in https://github.com/nodejs/node/issues/8381#issuecomment-275158505 and https://github.com/nodejs/node/issues/8381#issuecomment-288024753 but no one seems to have taken them up.
Seeing how there seems to be no movement and no broad buy-in from collaborators, I move we close this issue.
Just because there's no movement doesn't mean that the community agrees that this issue is worth closing.
But if collaborators do, then that is what is going to happen. I'll give it a few more days; if no collaborators speak out against, I'll close it out.
I still think that this is something that needs to be in core. There are many HTTP frameworks out there that will probably never support proxies.
The bloat argument seems weak, it's just one env lookup on startup, we're not asking for more. Also, with just that single lookup the security argument of a rogue module changing the proxy becomes naught.
Maybe someone should step up and provide an implementation in a PR?
I recommend using proxy-from-env as it's very complete.
The bloat argument seems weak, it's just one env lookup on startup
With all due respect, that statement betrays a certain lack of understanding when it comes to the complexities of HTTP proxying. Yes, one environment variable check - and then what? Pixies come and carry the request across the proxy?
I won't bore you by reiterating all the issues one has to deal with, I'll just link to https://github.com/nodejs/node/issues/1490#issuecomment-165065624. :-)
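For readers wondering what comes after the environment lookup, here is a rough sketch of the simplest case only: a plain-HTTP request sent through a forward proxy with core http. The helper names are hypothetical, the proxy is assumed to speak plain HTTP, and none of the harder parts (HTTPS targets, NO_PROXY, keep-alive agents, redirects) are covered.

```javascript
const http = require('http');

// Hypothetical helper: build http.request() options that route a plain-HTTP
// request for targetUrl through the forward proxy at proxyUrl.
function proxiedRequestOptions(targetUrl, proxyUrl) {
  const target = new URL(targetUrl);
  const proxy = new URL(proxyUrl);
  const headers = { Host: target.host };
  if (proxy.username) {
    // Credentials embedded in the proxy URL become a Basic auth header.
    const credentials =
      `${decodeURIComponent(proxy.username)}:${decodeURIComponent(proxy.password)}`;
    headers['Proxy-Authorization'] =
      `Basic ${Buffer.from(credentials).toString('base64')}`;
  }
  return {
    host: proxy.hostname,            // connect to the proxy, not the target
    port: Number(proxy.port) || 80,
    path: target.href,               // absolute URL in the request line
    headers,
  };
}

// Hypothetical wrapper around http.get() using the options above.
function getViaProxy(targetUrl, proxyUrl, onResponse) {
  return http.get(proxiedRequestOptions(targetUrl, proxyUrl), onResponse);
}
```

Usage would look like `getViaProxy('http://example.com/', 'http://proxy.example:3128', (res) => res.pipe(process.stdout))`, where both URLs are placeholders.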
I'm going to advise against this.
I still think that this is something that needs to be in core. There are many HTTP frameworks out there that will probably never support proxies.
Everyone who uses request has support for proxies and these env variables.
In fact, npm initially had its own http client and moved to request almost entirely for this purpose and is probably the only reason they've stayed on request.
I can't think of a single feature in request that has been harder to maintain than proxying over TLS https://github.com/request/tunnel-agent
The semantics are poorly defined at best and it took years to nail down what client implementations actually expect in every scenario. Moving this into Core would be a huge mistake. Bugs are very hard to track down and the entire implementation rests on the Agent implementation which is one of the worst parts of Node.js.
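For context on what "proxying over TLS" involves: HTTPS traffic is usually sent through a CONNECT tunnel, and TLS is then negotiated with the target over the tunnelled socket. The following is only an illustration with core http and tls, not how tunnel-agent is implemented. The function names are hypothetical, the proxy is assumed to be reachable over plain HTTP, and proxy authentication and Agent integration are left out.

```javascript
const http = require('http');
const tls = require('tls');

// Hypothetical helper: options for the CONNECT request sent to the proxy.
function connectOptions(targetUrl, proxyUrl) {
  const target = new URL(targetUrl);
  const proxy = new URL(proxyUrl);
  const authority = `${target.hostname}:${target.port || 443}`;
  return {
    host: proxy.hostname,
    port: Number(proxy.port) || 80,
    method: 'CONNECT',
    path: authority,               // CONNECT takes host:port, not a URL path
    headers: { Host: authority },
  };
}

// Hypothetical helper: open the tunnel, then start TLS to the target on it.
// Calls callback(err) on failure or callback(null, tlsSocket) on success;
// after success, the caller is responsible for handling socket errors.
function openTunnel(targetUrl, proxyUrl, callback) {
  const req = http.request(connectOptions(targetUrl, proxyUrl));
  req.once('error', callback);
  req.once('connect', (res, socket) => {
    if (res.statusCode !== 200) {
      socket.destroy();
      callback(new Error(`Proxy refused CONNECT: ${res.statusCode}`));
      return;
    }
    const secure = tls.connect({
      socket,
      servername: new URL(targetUrl).hostname, // SNI for the real target
    });
    secure.once('error', callback);
    secure.once('secureConnect', () => {
      secure.removeListener('error', callback);
      callback(null, secure);
    });
  });
  req.end();
}
```

Even this sketch leaves open what to do when the proxy itself is HTTPS, how to pool tunnelled sockets, and how to report proxy errors, which gives a sense of where the maintenance cost comes from.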
We have pretty good support for this in request, and npm is a good vector for testing and compliance because of how large the user base is, but it's not in the kind of shape that we'd want to stick it in Core and if you're considering writing another implementation I'm skeptical it will be compliant with what is out there in the wild without a few years of heavy testing.
Yes, one environment variable check - and then what?
I meant from a runtime performance cost perspective, not actual implementation, which is likely a nightmare, yes. Though I'm wondering how much of a pain to maintain this really is for golang or python.
@mikeal if it is implemented following https://github.com/nodejs/node/issues/8381#issuecomment-288024753 then I am not sure where the harm would be. Having support for proxies in node that is as solid as possible, and done right, should be a worthwhile goal, no?
@mikeal While "most" node module developers may be using request, not all are. There are plenty of other libraries that directly interface with node's http / https core modules. So this can't be solved by saying everyone should be using request.
Even if people did utilize request, the fact is that node modules can override these proxy settings and thus leave the end user no way to override this.
I've seen a similar issue pop up relating to SSL certificates. Node.js implements the following environment variables for the user to augment Node's default certificate list:
And yet, it explicitly states that if a ca option is directly utilized with the TLS or HTTPSClient module it is overridden. I've run across instances where, as an end user attempting to configure node with my private / internal CA cert, some node module author is somehow overriding this through their own usage of request, and thus I have no way to get this to work correctly.
And yet, it explicitly states that if a ca option is directly utilized with the TLS or HTTPSClient module it is overridden.
This sounds like something we should fix. Care to open a separate issue?
@silverwind Created #14705 to track that separately of this.
any updates? :D
Since this issue has become hard to oversee, is it okay if I open another issue that lists https://github.com/nodejs/node/issues/8381#issuecomment-288024753 as a step-by-step guide for possible contributions/contributors?
@martinheidegger Yes, good idea.
I'll close out this issue. As you say, it's become rather long and meandering.
Opened #15620 as a follow-up.
Not sure if anyone will see this, but I wanted to chime in that the Node ecosystem of HTTP modules has become rather obnoxious. Writing a React app from behind a proxy has become a real pain.
Many popular modules have hard-coded dependencies upon other HTTP modules, such as cross-fetch, isomorphic-fetch, axios, etc. Those modules, more often than not, do not support HTTP_PROXY at all, or if they do, don't support no_proxy, etc.
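As an illustration of why no_proxy support is easy to get subtly wrong, here is a minimal sketch of one possible matching rule. The function name and the exact semantics are assumptions: "*" matches everything, "*.home.com" and ".home.com" match subdomains only, and a bare "another.com" matches that host and its subdomains. Real tools differ on these details, and ports or IP ranges in the list are not handled here.

```javascript
// Hypothetical helper (not from any library): decide whether a hostname is
// excluded from proxying by a comma-separated NO_PROXY / no_proxy value.
function bypassesProxy(hostname, noProxy) {
  const host = hostname.toLowerCase();
  const entries = (noProxy || '')
    .split(',')
    .map((entry) => entry.trim().toLowerCase())
    .filter(Boolean);
  return entries.some((entry) => {
    if (entry === '*') return true;                          // bypass everything
    if (entry.startsWith('*.')) return host.endsWith(entry.slice(1));
    if (entry.startsWith('.')) return host.endsWith(entry);  // subdomains only
    return host === entry || host.endsWith(`.${entry}`);     // host or subdomain
  });
}

console.log(bypassesProxy('api.home.com', '*.home.com,another.com'));
```

A module that reads HTTP_PROXY but skips a check like this will send internal traffic to the proxy, which is the kind of partial support described above.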
So, what to do in this case? Hit up every module author, and hope that somehow they "eventually" get around to fixing the issue? Install some temporary workaround that someone published to npm because the module has effectively become abandonware? And what to do when the workaround itself becomes abandonware?
I've spent several days chasing down red herrings and dead ends, trying to find modules that support isomorphic HTTP support for proxies. While yes, Node engineers will ultimately disclaim responsibility for "userland concerns" such as proxies, the result is that Node itself becomes a shining paradigm of austerity surrounded by steaming piles of module garbage. Congrats on that accomplishment!
I made this same point a year ago in this thread. Not much has changed.
https://github.com/nodejs/node/issues/8381#issuecomment-265240434
@dejayc + @matthewwiesen I understand your frustration and feel for your need to vent, but there is a path laid out for people wanting to contribute: https://github.com/nodejs/node/issues/15620
@dejayc If you have (or can somehow get) a socks5 proxy, you can work around this with tsocks, unless you're on Windows, where it's much harder (I don't really recall, but it was somehow possible with WideCap).
Anyway, it would be really great if there was support in node.js for the http_proxy env var, or at least some globally installable module which monkey-patches http (I don't know of any yet and I don't have time to write one, so tsocks is fine for me for now).
Why is this closed?
I still have proxy issues with node 10 on windows.
Because it's not Node's problem, from what I understand.
@AndyOGo @dejayc See https://github.com/nodejs/node/issues/15620
Would be a "nice to have". It's respected by request but there are a fair amount of other modules for making http requests out there 🤷♂️
For Node.js v12 and above you can use https://github.com/gajus/global-agent.
For older versions of Node.js you can use https://www.npmjs.com/package/global-tunnel-ng.
So... does node-fetch support no_proxy or not?
Most helpful comment
I would agree that ultimately this is something that should be handled directly within node.
Node should read my environment variables to see that I am starting the node executable with my http_proxy environment variable set, and thus that I want all http requests to go through my proxy.
curl, npm, git, etc. all respect these environment variables by default. This is the purpose of the environment. The user has already specified these within their environment, so the fact that they aren't being respected is very confusing.
It shouldn't be left up to a non-node-core library developer to decide to support my proxy environment by properly configuring node supported HTTP module with my environment variables or via a configuration to be passed into their library because ultimately I may not be directly using those modules, such as in the case of using a framework that includes a library that includes a library that interfaces with the HTTP module. Thus this "userland" solution essentially creates a recursive issue through all dependencies which is much harder to solve in all cases.
Since the node HTTP library is at the core of this, it should solve this issue by respecting the environment variables used at run time and override other attempts where libraries / modules may try to set these settings itself unless some other environment variable is passed to allow for this.