Whenever a table row contains a Time value of "00:00:00" the application crashes with a "free(): invalid next size (fast)" error.
Read Time columns from MySQL/MariaDB that contain values set at midnight (00:00:00).
Normal fetching of the rows without the application crashing.
Application crashes with a "free(): invalid next size (fast)".
No other details shows even if RUST_BACKTRACE=1.
Read a MySQL table with Time columns (not DateTime) that contain "00:00:00" values.
The error happens even if only one of the rows contains a midnight value for Time.
I don't believe that this is related to #1130 since the values are converted correctly to NaiveTime.
In fact, I've even implemented a "solution" similar to the one mentioned by @frol in that ticket, adapted to NaiveTime, but it didn't make any difference.
Through lots of tests and debugging sessions I've been able to narrow the problem down to the deallocation (drop) of the BindData associated with the connection.
At the point of the deallocation the data in the Binds structure (the last processed row) could be one with a midnight (00:00:00) value or it could have a non-midnight value; the error happens in both cases. As long as there was a row with a midnight value processed at some point through the StatementIterator the problem will appear.
Below is a slightly edited call stack at the point when the application crashes (VS Code debugger). Specifically, the error happens when deallocating the first Time column in the bind data (there are two).
__GI_raise (@raise:45)
__GI_abort (@abort:66)
__libc_message (@__libc_message:180)
malloc_printerr (@7ffff73b547c..7ffff73b5503:3)
_int_free (@_int_free:201)
dealloc (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/alloc.rs:103)
dealloc (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/alloc.rs:174)
dealloc_buffer<u8,alloc::alloc::Global> (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/raw_vec.rs:709)
drop<u8,alloc::alloc::Global> (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/raw_vec.rs:719)
drop_in_place<alloc::raw_vec::RawVec<u8, alloc::alloc::Global>> (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr/mod.rs:174)
drop_in_place<alloc::vec::Vec<u8>> (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr/mod.rs:174)
drop_in_place<diesel::mysql::connection::bind::BindData> (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr/mod.rs:174)
drop_in_place<[diesel::mysql::connection::bind::BindData]> (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr/mod.rs:174)
drop<diesel::mysql::connection::bind::BindData> (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/liballoc/vec.rs:2320)
drop_in_place<alloc::vec::Vec<diesel::mysql::connection::bind::BindData>> (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr/mod.rs:174)
drop_in_place<diesel::mysql::connection::bind::Binds> (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr/mod.rs:174)
drop_in_place<diesel::mysql::connection::stmt::iterator::StatementIterator> (/home/.../.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ptr/mod.rs:174)
map<closure-0,mnemodia_common::models::week_slot::AccWeekSlot> (/home/.../.cargo/registry/src/github.com-1ecc6299db9ec823/diesel-1.4.4/src/mysql/connection/stmt/iterator.rs:37)
query_by_index<...> (/home/.../.cargo/registry/src/github.com-1ecc6299db9ec823/diesel-1.4.4/src/mysql/connection/mod.rs:77)
query_by_index<...> (/home/.../.cargo/registry/src/github.com-1ecc6299db9ec823/diesel-1.4.4/src/r2d2.rs:134)
At the point of raising the exception I can see that at _int_free (@_int_free:201) the nextsize variable has a value of "93825001331200".
This is probably the cause of the specific error message. In a "normal" deallocation this variable has more reasonable values.
If there are no midnight values in any row the deallocation happens without a problem.
Furthermore, I couldn't see any obvious issues while fetching each row in the method below (/.../diesel-1.4.4/src/mysql/connection/stmt/iterator.rs). binds is populated as expected, as far as I can see.
fn populate_row_buffers(stmt: &Statement, binds: &mut Binds) -> QueryResult<Option<()>> {
let next_row_result = unsafe { ffi::mysql_stmt_fetch(stmt.stmt.as_ptr()) };
match next_row_result as libc::c_uint {
ffi::MYSQL_NO_DATA => Ok(None),
ffi::MYSQL_DATA_TRUNCATED => binds.populate_dynamic_buffers(stmt).map(Some),
0 => {
binds.update_buffer_lengths();
Ok(Some(()))
}
_error => stmt.did_an_error_occur().map(Some),
}
}
I'm not sure if this error is easy to replicate on a different system/DB/table. I'm wondering if it hasn't been reported before because it's not common, or because Time columns aren't used that much (probably DateTime is more common).
Thanks & regards,
Alfonso
So I had a quick dive into this and will note my current results below:
So my local system is ubuntu 20.04 with libmysqlclient version 8.0.19-0ubuntu5 while the CI us using ubuntu 16.04 with libmysqlclient version 5.7.29-0ubuntu0.16.04.1. Out of curiosity I've tested started a docker container of 18.04 and also run the tests successfully there (libmysqlclient version 5.7.29-0ubuntu0.18.04.1), while it failed using 19.10 (libmysqlclient version 5.7.29-0ubuntu0.16.04.1).
So that does not say what is causing this, just that it could be possibly related to the version of libmysqlclient.
Thanks @weiznich! That was a quick turn around!
It seems like a big difference in behaviour for such similar versions (5.7.29-0ubuntu0.16.04.1 vs 5.7.29-0ubuntu0.18.04.1). I've had a quick look and they both seem to pack the same version of the upstream code: mysql-5.7.29 from 18-Dec-2019. Which makes it even more puzzling.
I'm wondering if the version of support libraries (libc6, liggcc1, etc.) could be making all the difference. Not sure if it's easy for you to do a quick test.
Cheers
Ok, seems like I've copied the wrong version for 19.04 🤦 The right one is 8.0.19-0ubuntu0.19.10.3, so this is probably related to same changes regarding to libmysqlclient between 5.8 and 8.0.
Ah, that makes more sense now. ;-)
Those two versions use version 8.0 of the connector, which I believe is a single C++ code base for C and C++ clients.
Would be interesting to know which component exactly contains the issue. Is it caused by diesel or is it a bug in mysqlclient? The next thing I would try is building everything with asan enabled or trying to run the tests inside of valgrind. Any help here would be nice.
So I've tried both asan and valgrind. Compiling and running the corresponding test with asan does make the error vanish 😳 .
valgrind emits the following output while the test succeeds…
==443951== Thread 2 mysql::types::da:
==443951== Invalid write of size 8
==443951== at 0x48B8468: ??? (in /usr/lib/x86_64-linux-gnu/libmysqlclient.so.21.1.19)
==443951== by 0x48A378C: mysql_stmt_fetch (in /usr/lib/x86_64-linux-gnu/libmysqlclient.so.21.1.19)
==443951== by 0x31C048: diesel::mysql::connection::stmt::iterator::populate_row_buffers (iterator.rs:133)
==443951== by 0x31BBED: diesel::mysql::connection::stmt::iterator::StatementIterator::next (iterator.rs:36)
==443951== by 0x347415: diesel::mysql::connection::stmt::iterator::StatementIterator::map (iterator.rs:29)
==443951== by 0x2F3B5F: <diesel::mysql::connection::MysqlConnection as diesel::connection::Connection>::query_by_index (mod.rs:77)
==443951== by 0x340D06: <T as diesel::query_dsl::load_dsl::LoadQuery<Conn,U>>::internal_load (load_dsl.rs:30)
==443951== by 0x340DF3: diesel::query_dsl::RunQueryDsl::load (mod.rs:1190)
==443951== by 0x3413D1: diesel::query_dsl::RunQueryDsl::get_result (mod.rs:1239)
==443951== by 0x30B40B: diesel::mysql::types::date_and_time::tests::times_of_day_decode_correctly (date_and_time.rs:227)
==443951== by 0x2C92B9: diesel::mysql::types::date_and_time::tests::times_of_day_decode_correctly::{{closure}} (date_and_time.rs:223)
==443951== by 0x2EB21D: core::ops::function::FnOnce::call_once (function.rs:232)
==443951== Address 0x5970e78 is 0 bytes after a block of size 40 alloc'd
==443951== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==443951== by 0x568F3B: alloc::alloc::alloc_zeroed (alloc.rs:162)
==443951== by 0x569091: <alloc::alloc::Global as core::alloc::AllocRef>::alloc_zeroed (alloc.rs:189)
==443951== by 0x5695B4: alloc::raw_vec::RawVec<T,A>::allocate_in (raw_vec.rs:88)
==443951== by 0x452723: alloc::raw_vec::RawVec<T>::with_capacity_zeroed (raw_vec.rs:149)
==443951== by 0x40B9E5: <u8 as alloc::vec::SpecFromElem>::from_elem (vec.rs:1750)
==443951== by 0x411DAD: alloc::vec::from_elem (vec.rs:1730)
==443951== by 0x333B64: diesel::mysql::connection::bind::BindData::for_output::{{closure}} (bind.rs:123)
==443951== by 0x320964: core::option::Option<T>::map (option.rs:456)
==443951== by 0x32F18D: diesel::mysql::connection::bind::BindData::for_output (bind.rs:120)
==443951== by 0x2EAC35: core::ops::function::FnMut::call_mut (function.rs:154)
==443951== by 0x2C7B81: core::iter::adapters::map_fold::{{closure}} (mod.rs:772)``
Another think I've checked is: Did the definition of MYSQL_TIME changed between 5.8 and 8.0, but it seems to be the same struct for both version.
Thanks @weiznich!
That's interesting. While debugging my error I noticed that the length member of the MYSQL_BIND structure for the Time columns was set to 48, while the length of the buffer (vector) was 40.
I thought that could be due to some alignment artefact, but the fact that the length member is supposed to be set by the _mysqlclient library_ made me a bit suspicious.
However, I've been doing some (unrelated) refactoring of my code and have started using BoxedQuery for this particular query... and when I've tried to replicate the error to add some details to this comment it's not happening any more. (?!?!?)
I guess the memory problem is somewhat subtle and is affected by any changes introduced by BoxedQuery, but perhaps this provides additional clues.
I'll see if I can go back to my previous version and do some more testing, but it might not be until tomorrow now.
I spoke too soon. :-(
As mentioned above, after changing to BoxedQuery the app stopped crashing with the same data that I was testing with before. However, when I changed to a different set of rows (still including a mix of columns with and without midnight values) the app crashed again - this time with "free(): invalid pointer".
Unfortunately, it seems that the context of the row processing and later deallocation has an significant impact on the present and exact nature of the problem.
That's interesting. While debugging my error I noticed that the length member of the MYSQL_BIND structure for the Time columns was set to 48, while the length of the buffer (vector) was 40.
I thought that could be due to some alignment artefact, but the fact that the length member is supposed to be set by the mysqlclient library made me a bit suspicious.
As this seems to be very suspicious I've digged again into the source code of mysql_time.h and it's horrifying…
Compare the definition from the central mysql github repository with the corresponding definition from the mariadb repository. Someone added a int time_zone_displacement in the mysql definition, making both incompatible. Additionally git blame reveals that this happens with the 8.0.19 release. I've not found any change log entry anywhere about this…
What now happens is the following:
diesel uses static bindings generated for an older mysqlclient versionMYSQL_TIME that has only a size of 40 bytes.diesel then uses this definition here to calculate the correct binding buffer size for the query result.libmysqlclient versions expect that the buffer has a size of 48 byte and write past the bounds of the buffer. While doing that some probably unrelated thing gets overridden. That then causes the crash in a with an unrelated error message about free.That all written comes up the question how to fix this mess. As the new field is in the end of the struct and the c standard does not allow struct field reordering it should be possible to just update our bindings to the new version. That means someone needs to run bindgen on mysql.h on different platforms and submit a PR to the mysqlclient-sys repo linked above. After that we can issue a new release that hopefully fixes this issue for newer versions of libmysqlclient. Older versions should continue to work as we now just allocate a larger buffer for them. They will only populate all but the last field, but as we initialize the buffer to be zeroed we can assume that the field will be then set to zero, which seems to be a sensible default for time_zone_displacement. Beside of that we may want to update the time handling code in diesel/mysql to handle that field as well, but that is something that could be done later.
As I'm not using mysql myself and also do not have access to a windows PC I will not work on this issue any time soon. Therefore I mark this issue as help wanted, so for anyone interested feel free to submit a corresponding fix.
I only ever use Linux, but I can give it a try on that environment. If the tests are successful on Linux then we'd just need somebody to do the Windows part - perhaps an easy change once we see all the differences between the two versions.
Ok, I think I made a bit of progress on the Linux front (I guess valid for MacOS too).
I've generated a new version of the bindings from version 8.0.19-0ubuntu5 of the library (libmysqlclient21). I wasn't quite sure which options to use, so I've looked at an old revision of mysqlclient_sys to replicate how it was being generated programmatically. This is what I've used:
bindgen --whitelist-function mysql.* --whitelist-type MYSQL.* --whitelist-var MYSQL.* -o bindings_macos.rs --default-enum-style=rust /usr/include/mysql/mysql.h -- -I/usr/include/mysql
I've applied the new file to the current version of mysqlclient_sys and then pointed a local version of diesel (master) at the modified repo. Initially it threw a few errors while building due to the change of my_bool to plain bool, plus some knock-on changes because now it is a proper boolean and before is was an int.
After those minor fixes I was able to compile my program and successfully run all the commands (or exacution paths) that it supports. This is not a full test by any means, but I don't think I can really run diesel's tests locally (?).
Note: Even though I'm normally using diesel 1.4.4, and I tried to apply the changes to that version initially, it started to come up with other errors that were not related to the update of the bindings (e.g. deprecation of Error::description()). In the end I went for the current master branch, which I guess makes sense in the long run.
So, I have a couple of patches, for diesel and mysqlclient_sys respectively, that might be worth testing a bit more "seriously". I could try and fork the repos and submit PRs or I can create a couple of patch files and send to anybody already doing PRs on a regular basis.
However, these changes are just for Linux/MacOS. Perhaps somebody can take the info about bindgen above and generate the Windows bindings (?).
Just a quick note about the build errors on the 1.4.x branch: Development normally happens on master. Those warnings/errors are fixed there. Changes are then backported if suitable to the other branch.
I've now tested my changes against libmysqlclient20 (libmysqlclient20_5.7.29-0ubuntu0.18.04.1) as well, and it seems to be working fine.
Again, this is a limited test, but it appears that the updated bindings are compatible with MySQL 5.7 and 8.0.
:+1: Feel free to open WIP PR's at the corresponding locations, so we can talk about specific changes there.
Hi @weiznich,
I've opened a PR for the mysqlclient-sys part of the fix: sgrif/mysqlclient-sys#24
My understanding is that the PR for diesel should wait until that change is committed to mysqlclient-sys and then we can update the dependency on diesel. Please let me know if you think there's a better approach.
As mentioned on the PR, I've run the diesel tests against both MySQL versions, 5.7 and 8.0, but that was linking to my local, updated, copy of mysqlclient-sys.
Thanks & regards,
Alfonso
I've pushed a fix not depending on a update of mysqlclient-sys as 1b0453f2cb814d39fbceb21807c5fb8c4ce9a6cc.
If someone is interested in seeing this fixed as part of the current 1.4.x release feel free to submit a backport PR based on this. (Keep in mind that the public FromSql/ToSql impls for mysqlclient-sys::MYSQL_TIME need to be preserved. I think they should internally relay on the impls based on the updated struct layout)
Happy to work on the 1.4.x backport (at least try).
@weiznich When you have a minute, could you explain what you mean with your comment about preserving FromSql/ToSql impls on the mysqlclient-sys version? As far as I can see the fix consist on creating and using a diesel-specific version of MYSQL_TIME that's compatible with newer libraries. So we are not touching or using the one from mysqlclient-sys, right?
@alfonso-eusebio The problem with just applying the same fix as in https://github.com/diesel-rs/diesel/commit/1b0453f2cb814d39fbceb21807c5fb8c4ce9a6cc is that we would remove all ToSql/FromSql for mysqlclient-sys::MYSQL_TIME impls generated here. Those are exposed as part of the public API. Removing them would be a breaking change. As workaround I propose the following:
MYSQL_TIME struct as done in the linked commitffi::MYSQL_TIME with that typeToSql/FromSql impls by internally calling that ones for our newly added MYSQL_TIME type in diesel. (So basically just calling diesel::MYSQL_TIME::into/diesel::MYSQL_TIME::from there)OK, the good news: I've managed to create a patch for branch 1.4.4 that "seems" to work.
The caveats:
#![allow(deprecated, invalid_value)] to diesel to get it to compile from my project (issues with deprecated Error::description() and the use of mem::uninitialized())FromSql/ToSql to use the new local version of MYSQL_TIME, but there are no other implementations that I can see.rust-toolchain I get a bunch of errors when running bin/test (or any other build/test command).I'd like to be able to properly set up my local repo clone and run a full set of tests on 1.4.4, and then with the patch.
Any ideas as to why I'm getting errors when trying to test branch 1.4.4? (rust-toolchain is set to 1.31.0)
Just a couple of examples of the first few errors:
error: expected identifier, found keyword `const`
--> /home/alfonso/.cargo/registry/src/github.com-1ecc6299db9ec823/smallvec-1.4.0/lib.rs:1779:16
|
1779 | unsafe impl<T, const N: usize> Array for [T; N] {
| ^^^^^ expected identifier, found keyword
...
error: expected `*` or `+`
--> /home/alfonso/.cargo/registry/src/github.com-1ecc6299db9ec823/object-0.19.0/src/endian.rs:578:34
|
578 | ($($struct_name:ident),+ $(,)?) => {
|
If I can manage to get the tests running then I can properly test the changes within diesel, rather that a simple case within my own app.
(Wondering as well why version 1.4.4 compiles fine inside my app when I get it from crates.io but throws errors when I get it from a local directory.)
Cheers
I've had to add #![allow(deprecated, invalid_value)] to diesel to get it to compile from my project (issues with deprecated Error::description() and the use of mem::uninitialized())
That's expected as we don't backport fixes for deprecation. It should be fine on 1.31, on newer rust versions the errors are generated as cargo does only suppress lint errors for dependencies from crates.io.
I haven't identified where or how to apply your point 3 above. I've modified FromSql/ToSql to use the new local version of MYSQL_TIME, but there are no other implementations that I can see.
We need to have both, a FromSql impl for the new local version of MYSQL_TIME to fix the bug, and a impl for the version from mysqlclient-sys to remain backward compatible to the old version. The second impl should just call the first one internally and do some type conversation. The same thing applies for ToSql.
I haven't been able to complete the cloning and testing of this repo following the contribution guidelines. Whatever version I set in rust-toolchain I get a bunch of errors when running bin/test (or any other build/test command).
It's a bit hard for backports to say what exactly needs to be done. We set a old version as default to verify that we support the same minimal rust version as before (basically that version that is set is the minimal supported version. Builds will break over time because our dependencies do not take MSRV as strict as we do and cargo pulls in newer versions by default. New rust versions break because we have #[deny(warnings)] somewhere and they will add new lints that generate new warnings. To see what is required to get it building on a certain rust version checkout the github actions ci workflow. (Otherwise it should also be fine to just use the CI there to check if everything is ok)
Wow, I am writing a CPP mysql client library, and this thread saved me days of frustration, thank you guys!