So far, we configured the ADIOS1 plugin with on-the-fly blosc compression in our `*.cfg` files along the lines of:
```bash
# ADIOS params
TBG_adios_agg="0"
TBG_adios_ost="32"
TBG_adios_transport_params="'stripe_count=4;stripe_size=1048576;block_size=1048576'"
TBG_adios_compression="'blosc:threshold=2048,shuffle=bit,lvl=1,threads=10,compressor=zstd'"
TBG_adios_additional_params="--adios.aggregators !TBG_adios_agg \
    --adios.ost !TBG_adios_ost \
    --adios.transport-params !TBG_adios_transport_params \
    --adios.compression !TBG_adios_compression \
    --adios.disable-meta 0"

# Dump simulation data (fields and particles) to ADIOS files.
TBG_adios="--adios.period 500 --adios.file simData --adios.source 'species_all,fields_all' !TBG_adios_additional_params"
```
and used `!TBG_adios` in `TBG_plugins` for simulation data output.
I know that with ADIOS2 we no longer have to configure aggregators, OSTs, and transport params, but how do I configure compression with blosc via the JSON string for the openPMD plugin?
The manual gives an example only for compression with bzip2, but this results in errors for large datasets, see #3506.
cc'ing @franzpoeschel, @psychocoderHPC, @PrometheusPi
These are the parameters for checkpointing. Take care to escape the `"` characters with `\"`:
```bash
--checkpoint.file checkpoint_blosc --checkpoint.openPMD.file checkpointAdios
--checkpoint.period 50
--checkpoint.openPMD.json '{ "adios2": {"dataset": {"operators": [{"type": "blosc","parameters": {"clevel": "7"}}]} }}'
```
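To verify that such an escaped string actually arrives as valid JSON, one can parse it offline. A minimal sketch, assuming the string below is the unescaped value as the shell passes it on to PIConGPU:

```python
import json

# The value of --checkpoint.openPMD.json after the shell removes its quoting
options = ('{ "adios2": {"dataset": {"operators": '
           '[{"type": "blosc","parameters": {"clevel": "7"}}]} }}')

cfg = json.loads(options)  # raises json.JSONDecodeError if the escaping went wrong
op = cfg["adios2"]["dataset"]["operators"][0]
print(op["type"], op["parameters"]["clevel"])  # → blosc 7
```

If `json.loads` succeeds, any remaining problem is on the ADIOS2/blosc side rather than in the shell quoting.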
If you would like to use blosc, you should have ADIOS2 2.7.1 installed.
Update: I am still trying to test the above, but the installation of ADIOS2 2.7.1 and openPMD 0.13.2 on our home cluster is still work in progress...
Since ADIOS2 is now installed on hemera and seems to work, what are your findings @steindev ?
So, I started testing compression and, of course, ran into problems :smile:
I let a simulation run on one node (4 GPUs) of the hemera cluster at HZDR with three different configurations of the openPMD based checkpoint plugin.
(1) Base configuration, no compression:
```bash
TBG_ADIOS2_configuration_Base="'{ \
  \"adios2\": { \
    \"engine\": { \
      \"type\": \"file\" \
      , \"parameters\": { \
        \"BufferGrowthFactor\": \"1.1\" \
        , \"InitialBufferSize\": \"36GB\" \
        , \"AggregatorRatio\" : \"1\" \
      } \
    } \
  } \
}'"
```
(2) blosc compression with `"clevel": "7"`:
```bash
TBG_ADIOS2_configuration_compression="'{ \
  \"adios2\": { \
    \"dataset\": { \
      \"operators\": [ { \
        \"type\": \"blosc\" \
        , \"parameters\": { \"clevel\": \"7\" } \
      } ] \
    } \
    , \"engine\": { \
      \"type\": \"file\" \
      , \"parameters\": { \
        \"BufferGrowthFactor\": \"1.1\" \
        , \"InitialBufferSize\": \"36GB\" \
        , \"AggregatorRatio\" : \"1\" \
      } \
    } \
  } \
}'"
```
(3) blosc compression with `"clevel": "1"` and bit shuffling:
```bash
TBG_ADIOS2_configuration="'{ \
  \"adios2\": { \
    \"dataset\": { \
      \"operators\": [ { \
        \"type\": \"blosc\", \
        \"parameters\": { \
          \"clevel\": \"1\" \
          , \"doshuffle\": \"BLOSC_BITSHUFFLE\" \
        } \
      } ] \
    } \
    , \"engine\": { \
      \"type\": \"file\" \
      , \"parameters\": { \
        \"BufferGrowthFactor\": \"1.1\" \
        , \"InitialBufferSize\": \"36GB\" \
        , \"AggregatorRatio\" : \"1\" \
      } \
    } \
  } \
}'"
```
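Hand-escaping these nested strings is error-prone. As a convenience sketch (not a PIConGPU feature), the configuration of case (3) can instead be built as a plain dict and serialized with Python's standard library, yielding the string to paste into the `.cfg` file:

```python
import json

# ADIOS2 configuration of case (3), built as a dict instead of an
# escaped shell string; json.dumps produces a guaranteed-valid JSON value
config = {
    "adios2": {
        "dataset": {
            "operators": [
                {
                    "type": "blosc",
                    "parameters": {
                        "clevel": "1",
                        "doshuffle": "BLOSC_BITSHUFFLE",
                    },
                }
            ]
        },
        "engine": {
            "type": "file",
            "parameters": {
                "BufferGrowthFactor": "1.1",
                "InitialBufferSize": "36GB",
                "AggregatorRatio": "1",
            },
        },
    }
}
print(json.dumps(config))
```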
The base case (1) produces a checkpoint of 134 GB within 3 s 93 ms; this time includes the calculation of the time step.
A reference simulation without output calculates the time step in 3 s 3 ms.
Case (2) produces a checkpoint of 77 GB within 3 s 65 ms.
Case (3) fails :unamused:
The following output appears in stderr:
```
[gv013:56992:0:56992] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x2ab31494c049)
[gv013:56991:0:56991] Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x2ab31494c049)
==== backtrace (tid: 56992) ====
 0 0x000000000004d455 ucs_debug_print_backtrace() ???:0
 1 0x000000000000ad0b blosclz_compress() :0
 2 0x0000000000007786 blosc_c() blosc.c:0
 3 0x0000000000009009 do_job() blosc.c:0
 4 0x000000000000a0cb blosc_compress() ???:0
 5 0x0000000000464c99 adios2::core::compress::CompressBlosc::Compress() ???:0
 6 0x00000000003e636b adios2::format::BPOperation::SetDataDefault<float>() ???:0
 7 0x000000000029a93a adios2::format::BPSerializer::PutOperationPayloadInBuffer<float>() ???:0
 8 0x000000000036bdd3 adios2::format::BP4Serializer::PutVariablePayload<float>() ???:0
 9 0x0000000000245dc4 adios2::core::engine::BP4Writer::PutSyncCommon<float>() ???:0
10 0x000000000024a037 adios2::core::engine::BP4Writer::PerformPutCommon<float>() ???:0
11 0x00000000002417ac adios2::core::engine::BP4Writer::PerformPuts() ???:0
12 0x0000000000073b5b adios2::Engine::PerformPuts() ???:0
13 0x0000000000176f60 openPMD::detail::BufferedActions::flush() ???:0
14 0x000000000017726c openPMD::ADIOS2IOHandlerImpl::flush() ???:0
15 0x00000000001772cd openPMD::ADIOS2IOHandler::flush() ???:0
16 0x00000000000d9599 openPMD::Series::flushFileBased() ???:0
17 0x00000000000d98f3 openPMD::Series::flush_impl() ???:0
18 0x00000000000d991e openPMD::Series::flush() ???:0
19 0x00000000007f9929 picongpu::openPMD::openPMDWriter::write() ???:0
20 0x0000000000801a94 picongpu::openPMD::openPMDWriter::dumpData() ???:0
21 0x0000000000801f63 picongpu::Checkpoint::checkpoint() ???:0
22 0x000000000073438e pmacc::SimulationHelper<3u>::dumpOneStep() ???:0
23 0x00000000007e213e pmacc::SimulationHelper<3u>::startSimulation() ???:0
24 0x00000000007e30a0 picongpu::SimulationStarter<picongpu::InitialiserController, picongpu::PluginController, picongpu::Simulation>::start() ???:0
25 0x00000000006beb41 (anonymous namespace)::runSimulation() ???:0
26 0x00000000006b460d main() ???:0
27 0x0000000000022555 __libc_start_main() ???:0
28 0x00000000006b5d6f _start() ???:0
=================================
```
Any thoughts on this, @psychocoderHPC, @franzpoeschel?
@steindev We need to compile our own openPMD, blosc, and ADIOS with debug symbols enabled to see where the data dump is crashing.
@franzpoeschel Just by chance, do you have an install script available that I could use as a starting point?
Not for blosc, but for ADIOS and openPMD this could be a starting point. Just remove the clutter you don't need. @steindev
@steindev Could you check whether your simulation also crashes if you use zstd as the compressor? Another thing is that you used `"clevel": "1"` for the failing run and `"clevel": "7"` for the run which did not fail.
> @steindev Could you check whether your simulation also crashes if you use zstd as the compressor?
What do I put in the JSON string to configure zstd as the standard compressor for blosc?
And do I need to install zstd? Is it different from zlib?
You need to add the parameter `"compressor": "zstd"`:
```bash
TBG_ADIOS2_configuration="'{ \
  \"adios2\": { \
    \"dataset\": { \
      \"operators\": [ { \
        \"type\": \"blosc\", \
        \"parameters\": { \
          \"clevel\": \"1\" \
          , \"compressor\": \"zstd\" \
          , \"doshuffle\": \"BLOSC_BITSHUFFLE\" \
        } \
      } ] \
    } \
    , \"engine\": { \
      \"type\": \"file\" \
      , \"parameters\": { \
        \"BufferGrowthFactor\": \"1.1\" \
        , \"InitialBufferSize\": \"36GB\" \
        , \"AggregatorRatio\" : \"1\" \
      } \
    } \
  } \
}'"
```
Allowed compressors are: `blosclz` (default), `lz4`, `lz4hc`, `snappy`, `zlib`, or `zstd`.
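For completeness, a configuration string per allowed compressor can be generated the same way as a single one; a small sketch using only the standard library (the parameter values are the ones from the discussion above, not recommendations):

```python
import json

# Generate one ADIOS2 blosc configuration string per allowed compressor
compressors = ["blosclz", "lz4", "lz4hc", "snappy", "zlib", "zstd"]
configs = {
    c: json.dumps({"adios2": {"dataset": {"operators": [{
        "type": "blosc",
        "parameters": {"clevel": "1", "compressor": c,
                       "doshuffle": "BLOSC_BITSHUFFLE"},
    }]}}})
    for c in compressors
}
print(configs["zstd"])
```

This makes it easy to benchmark the compressors against each other by swapping one `.cfg` variable.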
Finally, this issue will be closed. The above configuration posted by @psychocoderHPC works :confetti_ball:
Please note the following (for future reference):
- `zstd` is shipped with the blosc code and with Linux kernels released since 2017, such that there should be no need to install it
- `"InitialBufferSize": "36GB"` should be chosen a little larger than the GPU memory in order to avoid resizing of the buffer during data transfer from device to host to disk
- `zstd` with `"clevel": "1"` and `"doshuffle": "BLOSC_BITSHUFFLE"` produces smaller files than the standard `blosclz` with `"clevel": "7"`

EDIT 2021-05-18: See discussion below.
@steindev zstd should IMO not be part of the Linux standard; libz is what is mostly shipped.
Nevertheless, blosc by default compiles a zstd version shipped together with the blosc code.
Hmm, [Zstandard usage on Wikipedia](https://en.wikipedia.org/wiki/Zstandard#Usage) and linked references say it comes with the kernel since 2017.
So older systems will definitely use the version shipped with blosc.
I will edit the note.