Emscripten: getenv: Calling getenv deallocate string previously returned

Created on 4 Apr 2016  路  4Comments  路  Source: emscripten-core/emscripten

First, thanks @kripken for the awesome work :+1:

Here are the details:

In C code where there are consecutive calls to getenv, the memory allocated after returning the first value is freed doing an other call.

The following scenario illustrates the problem:

Modified tests/env/src.c adding:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main() {

  [...]

 printf("setenv(checkpersist_VALUE1) ret: %d\n", setenv("checkpersist_VALUE1", "env_value_should_persist", 0));
  printf("setenv(checkpersist_VALUE2) ret: %d\n", setenv("checkpersist_VALUE2", "an_other_env_value", 0));
  char* checkpersist_VALUE1 = getenv("checkpersist_VALUE1");
  printf("checkpersist_VALUE1: %s\n", checkpersist_VALUE1);
  printf("getenv(checkpersist_VALUE2): %s\n", getenv("checkpersist_VALUE2"));
  printf("checkpersist_VALUE1 (after getenv call): %s\n", checkpersist_VALUE1);

  return 0;
}

Running python tests/runner.py test_env gives:

--- expected
+++ actual
@@ -33,5 +33,5 @@
 setenv(checkpersist_VALUE2) ret: 0
 checkpersist_VALUE1: env_value_should_persist
 getenv(checkpersist_VALUE2): an_other_env_value
-checkpersist_VALUE1 (after getenv call): env_value_should_persist
+checkpersist_VALUE1 (after getenv call): an_other_env_value

This is confirmed looking at the implementation of getenv in src/library.js

The following line is freeing the memory:

if (_getenv.ret) _free(_getenv.ret);

PR will follow shortly.

Cc: @thewtex

wontfix

Most helpful comment

@Matheus28 Thanks for reviewing.

Agree with you, the current emscripten implementation is conformant.

To provide some background, I first discovered the problem while working with the cpython code base. See Modules/getpath.c

After your comment, I did some more searching and ... it seems common for existing implementation to assume multiple call to getenv do not modify the underlying environment.

For example, here are some occurrences within the git codebase:

In the case of git, this was discussed on the the associated project mailing list. Back in 2013, someone reported a similar issue on the z/OS platform mentioning the current use in git wasn't conformant either . Then, after few back and forth, the conclusion was [...] this patch is useful for reaching strict conformance to the published specs, but [...] we can go back to assuming no [other] known platforms are broken and unusable because of this. Source here

Looking at the current setup.c confirms no change have been made.

Considering that the glibc implementation of getenv ( stdlib/getenv.c ) stores the environment within a static variable __environ and doesn't modify it, I suggest that we modify the emscripten implementation.

That would make the emscripten implementation non confirmant but would greatly facilitate the porting of existing projects.

Let me know what you think ?

Thanks again for your time,
Jc

All 4 comments

That seems to be conformant to the man page for getenv(3). Or maybe I'm missing something?

The implementation of getenv() is not required to be reentrant.  The
string pointed to by the return value of getenv() may be statically
allocated, and can be modified by a subsequent call to getenv(),
putenv(3), setenv(3), or unsetenv(3).

@Matheus28 Thanks for reviewing.

Agree with you, the current emscripten implementation is conformant.

To provide some background, I first discovered the problem while working with the cpython code base. See Modules/getpath.c

After your comment, I did some more searching and ... it seems common for existing implementation to assume multiple call to getenv do not modify the underlying environment.

For example, here are some occurrences within the git codebase:

In the case of git, this was discussed on the the associated project mailing list. Back in 2013, someone reported a similar issue on the z/OS platform mentioning the current use in git wasn't conformant either . Then, after few back and forth, the conclusion was [...] this patch is useful for reaching strict conformance to the published specs, but [...] we can go back to assuming no [other] known platforms are broken and unusable because of this. Source here

Looking at the current setup.c confirms no change have been made.

Considering that the glibc implementation of getenv ( stdlib/getenv.c ) stores the environment within a static variable __environ and doesn't modify it, I suggest that we modify the emscripten implementation.

That would make the emscripten implementation non confirmant but would greatly facilitate the porting of existing projects.

Let me know what you think ?

Thanks again for your time,
Jc

In general, if this is a common bug then I am ok to work around it, and behave in the more expected way. However, this means memory will leak, so we should allocate carefully (one per string, not one per call).

This issue has been automatically marked as stale because there has been no activity in the past 2 years. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.

Was this page helpful?
0 / 5 - 0 ratings