Chapel: Improve support for valgrind+gasnet testing

Created on 21 Feb 2018  路  4Comments  路  Source: chapel-lang/chapel

We do nightly single-locale valgrind testing, and that's been incredibly valuable for finding and fixing invalid memory accesses (use-after-free and those sorts of things.)

Unfortunately, multi-locale valgrind testing isn't clean, so it makes diagnosing failures difficult. We should try to clean up multi-locale valgrind testing and run at least a subset of the test suite nightly. For motivation valgrind+gasnet testing would have caught the issue that https://github.com/chapel-lang/chapel/pull/8524 fixed immediately instead of the months it took us to stumble upon it.

I don't think we can expect gasnet+valgrind to work for every configuration, but I'm hoping we can at least get things working for the udp w/ local spawning and smp substrates.

We may also have better luck using an address sanitizer (#6158)

For future reference I found running valgrind manually with --track-origins=yes to be helpful for trying to find the origin of the bad memory access.

I tried getting smp and udp conduits working, but I ran into some issues. I timed out trying to get things working, so I'm just noting what I found for future reference:

TL;DR:

Here's how to try with smp:

source util/cron/common-valgrind.bash
source util/cron/common-gasnet.bash
export CHPL_COMM_SUBSTRATE=smp
export CHPL_MEM=jemalloc
export CHPL_RT_MAX_HEAP_SIZE=2G
export CHPL_RT_CALL_STACK_SIZE=1M
export GASNET_SUPPRESS=$CHPL_HOME/third-party/gasnet/gasnet-src/other/contrib/gasnet.supp

chpl test/release/examples/hello.chpl -g
valgrind --tool=memcheck -q --trace-children=yes --suppressions=$GASNET_SUPPRESS --track-origins=yes ./hello -nl2
# should be clean (but take ~60s), hello4 will report issues though

Here's how to try with udp:

source util/cron/common-valgrind.bash
source util/cron/common-gasnet.bash
export GASNET_SUPPRESS=$CHPL_HOME/third-party/gasnet/gasnet-src/other/contrib/gasnet.supp
# maybe apply diff below

chpl test/release/examples/hello.chpl -g
valgrind --tool=memcheck -q --trace-children=yes --suppressions=$GASNET_SUPPRESS --track-origins=yes ./hello -nl2

SMP:

I was able to hack gasnet-smp into somewhat working. CHPL_COMM_SUBSTRATE=smp requires segment fast/large, which means we currently need CHPL_MEM=jemalloc. Technically that should work for now, but jemalloc 5 removed valgrind support, so that won't work once we upgrade jemalloc versions.

That config seems to work with valgrind for hello.chpl, but teardown takes a really long time and consumes a lot of memory. My guess is that valgrind is tracing the entire registered heap for each locale and that process is slow. Setting CHPL_RT_MAX_HEAP_SIZE=2G allows hello.chpl to finish in ~20 seconds (takes 10s of minutes without lowering the heap size)

hello4 does not run cleanly. We see errors along the lines of:

==43335== Syscall param write(buf) points to uninitialised byte(s)
==43335==    at 0x5BD545D: ??? (in /lib64/libc-2.22.so)
==43335==    by 0x5B6B83F: _IO_file_write@@GLIBC_2.2.5 (in /lib64/libc-2.22.so)
==43335==    by 0x5B6AE62: new_do_write (in /lib64/libc-2.22.so)
==43335==    by 0x5B6C6A4: _IO_do_write@@GLIBC_2.2.5 (in /lib64/libc-2.22.so)
==43335==    by 0x5B6BD60: _IO_file_xsputn@@GLIBC_2.2.5 (in /lib64/libc-2.22.so)
==43335==    by 0x5B61C4C: fwrite (in /lib64/libc-2.22.so)
==43335==    by 0x4FC799: qio_fwritev (in /data/cf/chapel/eronagha/chapel/hello4-datapar-dist_real)
==43335==    by 0x4FCD5E: _qio_buffered_behind (in /data/cf/chapel/eronagha/chapel/hello4-datapar-dist_real)
==43335==    by 0x50297F: _qio_channel_flush_qio_unlocked (in /data/cf/chapel/eronagha/chapel/hello4-datapar-dist_real)
==43335==    by 0x51095B: qio_channel_write_newline (in /data/cf/chapel/eronagha/chapel/hello4-datapar-dist_real)
==43335==    by 0x4D8414: on_fn_chpl23 (IO.chpl:4002)
==43335==    by 0x4D882C: wrapon_fn_chpl23 (IO.chpl:3998)
==43335==  Address 0x4029000 is in a rw- anonymous segment
==43335==  Uninitialised value was created by a heap allocation
==43335==    at 0x593229: chpl_je_malloc (jemalloc.c:1650)
==43335==    by 0x4058CE: chpl_malloc (chpl-mem-impl.h:53)
==43335==    by 0x405988: chpl_mem_allocMany (chpl-mem.h:74)
==43335==    by 0x4059ED: chpl_mem_alloc (chpl-mem.h:83)
==43335==    by 0x48704C: chpl_here_alloc (LocaleModelHelpMem.chpl:53)
==43335==    by 0x418205: copyRemoteBuffer (String.chpl:115)
==43335==    by 0x420341: chpl___ASSIGN_4 (String.chpl:1410)
==43335==    by 0x4C802C: _write_text_internal_chpl (IO.chpl:2832)
==43335==    by 0x4D2B03: on_fn_chpl23 (IO.chpl:4002)
==43335==    by 0x4D882C: wrapon_fn_chpl23 (IO.chpl:3998)
==43335==    by 0x4F55F4: fork_wrapper (in /data/cf/chapel/eronagha/chapel/hello4-datapar-dist_real)
==43335==    by 0x4F22EC: thread_begin (in /data/cf/chapel/eronagha/chapel/hello4-datapar-dist_real)

It's seems possible that this is from using jemalloc instead of the system allocator, though I'm not sure

Still, hello.chpl running clealy seems like a good sign. I think the TODOs to get this working better are:

  • [ ] See if we can use CHPL_MEM=cstdlib with CHPL_COMM_SUBSTRATE=smp. smp is really just emulating multiple nodes on a single machine with shared memory, so I'm wondering if we can get away without registering and use cstdlib. That should avoid the crazy long shutdown times and avoid the issue with jemalloc >=5 not supporting valgrind.
  • [ ] hello4 gets some errors. It's not obvious if this is from using jemalloc, or if this is a real issue

UDP:

quickstart+gasnet-udp seems to work, but there is some noise that makes investigating difficult (at least I'm assuming it's noise and not a real issue since we don't see the same problem with smp):

Here's an example of what we see:

./util/printchplenv --anonymize

CHPL_TARGET_PLATFORM: linux64
CHPL_TARGET_COMPILER: gnu
CHPL_TARGET_ARCH: none *
CHPL_LOCALE_MODEL: flat
CHPL_COMM: gasnet *
  CHPL_COMM_SUBSTRATE: udp
  CHPL_GASNET_SEGMENT: everything
CHPL_TASKS: fifo *
CHPL_LAUNCHER: amudprun
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: cstdlib *
CHPL_ATOMICS: intrinsics
  CHPL_NETWORK_ATOMICS: none
CHPL_GMP: none *
CHPL_HWLOC: none
CHPL_REGEXP: re2
CHPL_AUX_FILESYS: none

chpl test/release/examples/hello.chpl -g

valgrind --tool=memcheck -q --trace-children=yes --suppressions=$GASNET_SUPPRESS --track-origins=yes ./hello -nl2
==31246== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==31246==    at 0x4E45253: __sendto_nocancel (in /lib64/libpthread-2.22.so)
==31246==    by 0x52D450: sendPacket(amudp_ep*, amudp_msg_t*, unsigned long, sockaddr_in, packet_type) [clone .isra.5] (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x530212: AMUDP_RequestGeneric(amudp_category_t, amudp_ep*, unsigned int, unsigned char, void*, int, unsigned long, int, __va_list_tag*, unsigned char, unsigned char) [clone .constprop.7] (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x530685: AMUDP_RequestIVA (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x4B0DAA: gasnetc_AMRequestMediumM (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x527C27: gasneti_auxseg_attach (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x4B07A1: gasnetc_attach (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x48697E: chpl_comm_init (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x47EE60: chpl_rt_init (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x405438: main (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==  Address 0x5eb016c is 76 bytes inside a block of size 136 alloc'd
==31246==    at 0x4C29110: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==31246==    by 0x527FC0: _AMUDP_malloc (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x528A69: AMUDP_AcquireBuffer (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x52FEAE: AMUDP_RequestGeneric(amudp_category_t, amudp_ep*, unsigned int, unsigned char, void*, int, unsigned long, int, __va_list_tag*, unsigned char, unsigned char) [clone .constprop.7] (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x530685: AMUDP_RequestIVA (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x4B0DAA: gasnetc_AMRequestMediumM (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x527C27: gasneti_auxseg_attach (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x4B07A1: gasnetc_attach (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x48697E: chpl_comm_init (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x47EE60: chpl_rt_init (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x405438: main (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==  Uninitialised value was created by a heap allocation
==31246==    at 0x4C29110: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==31246==    by 0x527FC0: _AMUDP_malloc (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x528A69: AMUDP_AcquireBuffer (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x52FEAE: AMUDP_RequestGeneric(amudp_category_t, amudp_ep*, unsigned int, unsigned char, void*, int, unsigned long, int, __va_list_tag*, unsigned char, unsigned char) [clone .constprop.7] (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x530685: AMUDP_RequestIVA (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x4B0DAA: gasnetc_AMRequestMediumM (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x527C27: gasneti_auxseg_attach (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x4B07A1: gasnetc_attach (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x48697E: chpl_comm_init (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x47EE60: chpl_rt_init (in /data/cf/chapel/eronagha/chapel/hello_real)
==31246==    by 0x405438: main (in /data/cf/chapel/eronagha/chapel/hello_real)
...230 other lines

With an experimental patch from the gasnet team, we see fewer issues:

--- a/other/amudp/amudp_ep.cpp
+++ b/other/amudp/amudp_ep.cpp
@@ -179,7 +179,7 @@ extern amudp_buf_t *AMUDP_AcquireBuffer(ep_t ep, size_t
sz) {
      bh = pool->free;
      pool->free = bh->next;
    } else {
-    bh = (amudp_bufferheader_t *)AMUDP_malloc(sizeof(amudp_bufferheader_t) + poolsz);
+    bh = (amudp_bufferheader_t *)AMUDP_calloc(1,sizeof(amudp_bufferheader_t) + poolsz);
    }
    bh->pool = pool;
==44039== Thread 3:
==44039== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==44039==    at 0x4E45273: ??? (in /lib64/libpthread-2.22.so)
==44039==    by 0x52D450: sendPacket(amudp_ep*, amudp_msg_t*, unsigned long, sockaddr_in, packet_type) [clone .isra.5] (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x530212: AMUDP_RequestGeneric(amudp_category_t, amudp_ep*, unsigned int, unsigned char, void*, int, unsigned long, int, __va_list_tag*, unsigned char, unsigned char) [clone .constprop.7] (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x530685: AMUDP_RequestIVA (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x4B0DAA: gasnetc_AMRequestMediumM (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x4873C1: chpl_comm_broadcast_private (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x456FB6: chpl__init_Sort (Sort.chpl:188)
==44039==    by 0x431F3E: chpl__init_ChapelArray (ChapelArray.chpl:161)
==44039==    by 0x42719F: chpl__init_DefaultRectangular (DefaultRectangular.chpl:22)
==44039==    by 0x45057A: chpl__init_LocaleModelHelpSetup (LocaleModelHelpSetup.chpl:30)
==44039==    by 0x44FE34: chpl__init_LocaleModelHelpFlat (LocaleModelHelpFlat.chpl:20)
==44039==    by 0x41EB4F: chpl__init_LocaleModel (LocaleModel.chpl:29)
==44039==  Address 0x5eb0a88 is 88 bytes inside a block of size 136 alloc'd
==44039==    at 0x4C2B240: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==44039==    by 0x528341: _AMUDP_calloc (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x528A6E: AMUDP_AcquireBuffer (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x52D074: AMUDP_DrainNetwork(amudp_ep*) (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x52FB69: AM_Poll (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x4B0A10: gasnetc_AMPoll (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x527D6E: gasneti_auxseg_attach (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x4B07A1: gasnetc_attach (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x48697E: chpl_comm_init (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x47EE60: chpl_rt_init (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==    by 0x405438: main (in /data/cf/chapel/eronagha/chapel/hello_real)
==44039==  Uninitialised value was created by a stack allocation
==44039==    at 0x456EE1: chpl__init_Sort (Sort.c:2)
...40 other lines

but it's still not clean. The un-initialized value sees to be coming from generated chapel code, so it's possible this is an actual error we're seeing.

BTR Bug

All 4 comments

I see valgrind's point, though its complaint is a little different for me.

==4760== Thread 3:
==4760== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==4760==    at 0x4E4AAD3: ??? (syscall-template.S:84)
==4760==    by 0x528D90: sendPacket(amudp_ep*, amudp_msg_t*, unsigned long, sockaddr_in, packet_type) [clone .isra.5] (in /home/fortytwo/src/valgrind-gasnet-udp/hello_real)
==4760==    by 0x52B947: AMUDP_RequestGeneric(amudp_category_t, amudp_ep*, unsigned int, unsigned char, void*, int, unsigned long, int, __va_list_tag*, unsigned char, unsigned char) [clone .constprop.7] (in /home/fortytwo/src/valgrind-gasnet-udp/hello_real)
==4760==    by 0x52BEA5: AMUDP_RequestIVA (in /home/fortytwo/src/valgrind-gasnet-udp/hello_real)
==4760==    by 0x4A9169: gasnetc_AMRequestMediumM (in /home/fortytwo/src/valgrind-gasnet-udp/hello_real)
==4760==    by 0x482B88: chpl_comm_broadcast_private (comm-gasnet.c:987)
==4760==    by 0x452317: chpl__init_Sort (Sort.c:26)
==4760==    by 0x42FC5E: chpl__init_ChapelArray (ChapelArray.c:28)
...
==4760==  Address 0x5ee6258 is 88 bytes inside a block of size 136 alloc'd
==4760==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==4760==    by 0x523C41: _AMUDP_calloc (in /home/fortytwo/src/valgrind-gasnet-udp/hello_real)
==4760==    by 0x5243F2: AMUDP_AcquireBuffer (in /home/fortytwo/src/valgrind-gasnet-udp/hello_real)
==4760==    by 0x52B5D4: AMUDP_RequestGeneric(amudp_category_t, amudp_ep*, ...
...
==4760==  Uninitialised value was created by a stack allocation
==4760==    at 0x452364: init_chpl12 (Sort.c:50)

savec/Sort.c

/* Sort.chpl:179 */
static void chpl__init_Sort(int64_t _ln_chpl,
                            int32_t _fn_chpl) {
  _ref_int32_t refIndentLevel_chpl = NULL;
  _ref_DefaultComparator _ref_tmp__chpl = NULL;
  ReverseComparator_DefaultComparator_chpl ret_chpl;
  DefaultComparator_chpl comparator_chpl;
  ReverseComparator_DefaultComparator_chpl _this_chpl;
  ref_DefaultComparator _ref_tmp__chpl2 = NULL;
  if (chpl__init_Sort_p) {
    goto _exit_chpl__init_Sort_chpl;
  }
  printModuleInit("%*s\n", "Sort", INT64(4));
  refIndentLevel_chpl = &moduleInitLevel;
  *(refIndentLevel_chpl) += INT64(1);
  chpl__init_Sort_p = UINT8(true);
  chpl_addModule("Sort", ((c_fn_ptr)(chpl__deinit_Sort)));
  _ref_tmp__chpl = &defaultComparator_chpl;
  init_chpl11(_ref_tmp__chpl);
  chpl_comm_broadcast_private(INT64(38), sizeof(DefaultComparator_chpl), -1);
  _ref_tmp__chpl2 = &comparator_chpl;
  init_chpl11(_ref_tmp__chpl2);

  init_chpl12(&_this_chpl, &comparator_chpl);
  ret_chpl = _this_chpl;
  reverseComparator_chpl = ret_chpl;
  chpl_comm_broadcast_private(INT64(39), sizeof(ReverseComparator_DefaultComparator_chpl), -1);
  *(refIndentLevel_chpl) -= INT64(1);
  _exit_chpl__init_Sort_chpl:;
  return;
}

/* Sort.chpl:813 */
static void init_chpl11(_ref_DefaultComparator this_chpl) {
  return;
}

/* Sort.chpl:813 */
static void init_chpl10(_ref_DefaultComparator this_chpl,
                        _ref_DefaultComparator other_chpl) {
  return;
}

/* Sort.chpl:856 */
static void init_chpl12(_ref_ReverseComparator_DefaultComparator this_chpl,
                        _ref_DefaultComparator comparator_chpl) {
  DefaultComparator_chpl tmp_chpl;
  _ref_DefaultComparator _ref_tmp__chpl = NULL;
  _ref_tmp__chpl = &tmp_chpl;
  init_chpl10(_ref_tmp__chpl, comparator_chpl);
  (this_chpl)->comparator_chpl = tmp_chpl;
  return;
}

The last line of init_chpl12 sets _this_chpl from the
uninitialized tmp_chpl. Then _this_chpl is copied into the
global reverseComparator_chpl.

I'm guessing init_chpl10() and init_chpl11() don't do anything
because ReverseComparator(DefaultComparator)'s only member is a
DefaultComparator, which doesn't have any members?

chpl__header.h's implementation of
ReverseComparator(DefaultComparator) matches the first part,

typedef struct chpl_ReverseComparator_DefaultComparator_chpl_s {
  DefaultComparator_chpl comparator_chpl;
} ReverseComparator_DefaultComparator_chpl;

But its DefaultComparator does have a field,

typedef struct chpl_DefaultComparator_chpl_s {
  uint8_t dummyFieldToAvoidWarning;
} DefaultComparator_chpl;

which despite its name is what I'm guessing is triggering the valgrind
warning.

I don't get a warning about defaultComparator_chpl, I guess
because my chpl__init_Sort() takes its address and calls a noop
init_chpl11() on it, but doesn't store an uninitialized stack
variable to it. So chpl__header.h's definition as

static DefaultComparator_chpl defaultComparator_chpl;

keeps its well-defined initialized value. But my
chpl__init_Sort() does explicitly copy an uninitialized value over
reverseComparator_chpl's well-defined initialized value.

I suspect that many of the reported problems are due to padding. (And that there are more problems than any one run reports...)

On a lark, I tried just removing the dummy field. gcc has no
complaints, and it gets rid of the valgrind warning due to
uninitialized stack memory in Sort.c

diff --git a/compiler/codegen/type.cpp b/compiler/codegen/type.cpp
index b2c00c1..9b35dbb 100644
--- a/compiler/codegen/type.cpp
+++ b/compiler/codegen/type.cpp
@@ -302,7 +302,7 @@ void AggregateType::codegenDef() {
           fprintf(outfile, "union {\n");
       } else if (this->fields.length == 0) {
         // TODO: remove and enforce at least 1 element in a union
-        fprintf(outfile, "uint8_t dummyFieldToAvoidWarning;\n");
+        //fprintf(outfile, "uint8_t dummyFieldToAvoidWarning;\n");
       }

       if (this->fields.length != 0) {

But it just replaces it with another valgrind warning about
uninitialized stack memory. I suspect valgrind is just reporting the
first use of uninitialized memory (by each thread?) (in this buffer?).
As I make changes, valgrind reports a different location for the
uninitialized memory, so I suspect it's just reporting the first thing
it sees.

But the first error it reports is consistent, until I change it. So I
think what I'm changing is letting it get on to the next problem. So
what I'm changing must be what it's complaining about.

And in each case (besides the dummyFieldectomy) what I'm changing is
memseting a struct to 0, where it appears to me that the code is
initializing each field of the struct. So the only difference should
be the compiler-added padding. So presumably what's happening is
valgrind notes that padding isn't initialized, but is ending up passed
to a sendto(), and being transmitted.

First warning:

==29643==  Uninitialised value was created by a stack allocation
==29643==    at 0x4328B0: _newDomain (ChapelArray.c:64)

Making this change (compiled with --no-munge-user-idents):

--- a/ChapelArray.c
+++ b/ChapelArray.c
@@ -59,6 +59,35 @@ static void _newArray(DefaultRectangularArr_1_int64_t_F_localesSignal_int64_t va
   return;
 }

+/*
+(gdb) ptype  _domain_DefaultRectangularDom_1_int64_t_F
+type = struct chpl__domain_DefaultRectangularDom_1_int64_t_F_s {
+    int64_t _pid;
+    chpl____wide_DefaultRectangularDom_1_int64_t_F _instance;
+    chpl_bool _unowned;                        // 1 byte
+    // 7 bytes of padding
+}
+
+(gdb) ptype chpl____wide_DefaultRectangularDom_1_int64_t_F
+type = struct chpl_chpl____wide_DefaultRectangularDom_1_int64_t_F_s {
+    chpl_localeID_t locale;                    // 4 bytes
+                                               // 4 bytes of padding
+    DefaultRectangularDom_1_int64_t_F addr;    // 8 byte pointer
+}
+*/
+
+static void init_padding_1(_domain_DefaultRectangularDom_1_int64_t_F *s) {
+  char *p;
+
+  p = (char*)&s->_unowned;
+  p++;
+  memset(p, 0, 7);
+
+  p = (char*)&s->_instance.locale;
+  p += 4;
+  memset(p, 0, 4);
+}
+
 /* ChapelArray.chpl:352 */
 static void _newDomain(DefaultRectangularDom_1_int64_t_F value,
                        _domain_DefaultRectangularDom_1_int64_t_F * _retArg) {
@@ -68,6 +97,10 @@ static void _newDomain(DefaultRectangularDom_1_int64_t_F valu
e,
   chpl____wide_DefaultRectangularDom_1_int64_t_F T2 = {CHPL_LOCALEID_T_INIT, NU
LL};
   chpl____wide__ref__wide__domain_DefaultRectangularDom_1_int64_t_F chpl_macro_
tmp_662;
   chpl____wide_DefaultRectangularDom_1_int64_t_F chpl_macro_tmp_663;
+
+  init_padding_1(&initTemp);
+  init_padding_1(&call_tmp2);
+
   chpl_macro_tmp_662.locale = chpl_gen_getLocaleID();
   chpl_macro_tmp_662.addr = &initTemp;
   T = chpl_macro_tmp_662;

Now the first warning is

==30356==  Uninitialised value was created by a stack allocation
==30356==    at 0x43599A: init7 (ChapelArray.c:939)

Initialize more padding:

diff --git a/ChapelArray.c b/ChapelArray.c
index a976d32..64daef3 100644
--- a/ChapelArray.c
+++ b/ChapelArray.c
@@ -932,6 +932,20 @@ static void getIndices(_domain_DefaultRectangularDom_1_int64_t_F * this6,
   return;
 }

+/*
+(gdb) ptype chpl____wide_DefaultRectangularArr_1_int64_t_F_locale_int64_t
+type = struct chpl_chpl____wide_DefaultRectangularArr_1_int64_t_F_locale_int64_t_s {
+    chpl_localeID_t locale; // same 4 byte locale
+                            // 4 bytes of padding
+    DefaultRectangularArr_1_int64_t_F_locale_int64_t addr; // 8 byte ptr
+}
+ */
+static void init_padding_2(chpl____wide_DefaultRectangularArr_1_int64_t_F_locale_int64_t *p) {
+  char *s = (char*)&p->locale;
+  s += 4;
+  memset(s, 0, 4);
+}
+
 /* ChapelArray.chpl:2141 */
 static void init7(_array_DefaultRectangularArr_1_int64_t_F_locale_int64_t * this6,
                   int64_t _pid,
@@ -939,6 +953,7 @@ static void init7(_array_DefaultRectangularArr_1_int64_t_F_locale_int64_t * this
                   chpl_bool _unowned) {
   chpl____wide_DefaultRectangularArr_1_int64_t_F_locale_int64_t T = {CHPL_LOCALEID_T_INIT, NULL};
   chpl____wide_DefaultRectangularArr_1_int64_t_F_locale_int64_t chpl_macro_tmp_705;
+  init_padding_2(&chpl_macro_tmp_705);
   (this6)->_pid = _pid;
   chpl_macro_tmp_705.locale = chpl_gen_getLocaleID();
   chpl_macro_tmp_705.addr = _instance;

Changes the first warning to:

==30850==  Uninitialised value was created by a stack allocation
==30850==    at 0x432786: _newArray2 (ChapelArray.c:42)

Once more unto the breach:

diff --git a/ChapelArray.c b/ChapelArray.c
index 64daef3..4489de2 100644
--- a/ChapelArray.c
+++ b/ChapelArray.c
@@ -37,11 +37,27 @@ static void chpl__deinit_ChapelArray(void) {
   return;
 }

+/* This struct has the same layout as the one in init_padding_1 */
+static void init_padding_3(_array_DefaultRectangularArr_1_int64_t_F_locale_int64_t *s) {
+  char *p;
+
+  p = (char*)&s->_unowned;
+  p++;
+  memset(p, 0, 7);
+
+  p = (char*)&s->_instance.locale;
+  p += 4;
+  memset(p, 0, 4);
+}
+
+
 /* ChapelArray.chpl:336 */
 static void _newArray2(DefaultRectangularArr_1_int64_t_F_locale_int64_t value,
                        _array_DefaultRectangularArr_1_int64_t_F_locale_int64_t * _retArg) {
   _array_DefaultRectangularArr_1_int64_t_F_locale_int64_t call_tmp2;
   _array_DefaultRectangularArr_1_int64_t_F_locale_int64_t initTemp;
+  init_padding_3(&call_tmp2);
+  init_padding_3(&initTemp);
   init7(&initTemp, INT64(-1), value, UINT8(false));
   call_tmp2 = initTemp;
   *(_retArg) = call_tmp2;

Now:

==31392==  Uninitialised value was created by a stack allocation
==31392==    at 0x451150: helpSetupRootLocaleFlat (LocaleModelHelpSetup.c:612)

That function's got over a hundred lines of stack variable
declarations. Your turn. :-P

But in each of those cases, initializing the C-level padding quiets
valgrind's complaint. And in each case, the complaint of the use of
the uninitialized memory is in sendPacket->sendto(). So it looks
like valgrind's complaints are where the uninitialized part of the
struct is being put "on the wire".

I'm not a Valgrind user myself, but given that Chapel routinely sends struct padding on the wire (which presumably won't change anytime soon), wouldn't it be simpler to just add a valgrind suppression rule that ignores uninitialized values from any source passed by AMUDP to sendto()? This of course might result in missing a rare bug that truly sent garbage values in a meaningful field on the wire, but given the socket boundary is also the inter-process line where Valgrind's analysis loses track of information, it doesn't seem like you really lose that much.

We're willing to accept patches to the valgrind suppression file shipped with GASNet (that we do not carefully maintain), or you can of course create your own.

I was able to use valgrind + quickstart + gasnet (udp) with a suppression file that @vasslitvinov sent to me

  {
     socketcall-sendto
     Memcheck:Param
     socketcall.sendto(msg)
     obj:*
     fun:*
  }

If that is saved to say $HOME/my.supp then one can run a program with valgrind --trace-children=yes --suppressions=$HOME/my.supp.

Note that start_test -valgrindexe under gasnet uses a suppressions file from the GASNet project that is focused on preventing spurious memory leaks (but that's not what we use valgrind for in our testing). I would imagine that we could update it to use the above suppression instead without too much trouble.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Rapiz1 picture Rapiz1  路  3Comments

buddha314 picture buddha314  路  3Comments

bradcray picture bradcray  路  3Comments

ben-albrecht picture ben-albrecht  路  3Comments

vasslitvinov picture vasslitvinov  路  3Comments