utils::replace_all shows the way to pass std::string to Rust and get std::string back. #332 should simplify the process further. With that, we have an arsenal to port a whole lot of functions from util namespace:
consolidate_whitespace (#339)get_command_output (#358)run_command (#358)run_program (#358)resolve_tilde (#373)absolute_url (#371)strwidth (#397)strwidth_stfl (#402)to_u (#340)is_valid_color (#353)is_valid_attribute (#356)censor_url (#372)trim_end (#342)trim (#342)quote (#394)get_random_value (#348)quote_if_necessary (#394)get_auth_method (#517)is_special_url (#360)is_http_url (#354)is_query_url (#360)is_filter_url (#360)is_exec_url (#360)escape_url (#352)unescape_url (#352)make_title (#391)getcwd (#575)remove_soft_hyphens (#576)is_valid_podcast_type (#363)get_default_browser (#344)get_basename (#616)mkdir_parents (#618)strnaturalcmp (#629)quote_for_stfl (#661)gentabs (#662)extract_filter (#758)substr_with_width (#773)run_interactively (#799)translit (shouldn't be ported—see comment for details)convert_text (shouldn't be ported—see comment for details)I'd prefer Rust versions to have rustic APIs; e.g. std::string f(const std::string& a) should become f(a: &str) -> String, not f(a: &String) -> String. The intent behind this is to allow these functions to be easily used on the Rust side too, once we got more code there. But it's not the primary goal of this issue, and I'll merge anything that passes the tests :)
The way I ported utils::replace_all was:
I don't require you to follow the same process, I'm just sharing the experience.
At the time of writing, src/tuils.cpp only has 63% coverage. If the function you're porting is poorly tested, please increase its test coverage first. Or pick a different one if you're not in the mood to write tests; that's fine too!
Please submit one PR per function for ease of reviewing and to speed up merging. Mention this issue so I can update the list.
Might be a good idea to post a comment here whenever you take up or stop porting some function, to avoid duplicate work. If you stop because of some technical reason, please write down some details on that, so we can see if there are some common problems, or where help is needed.
Last but not least: we don't have any of this porting business figured out yet, so please raise questions and issues as you stumble upon roadblocks. We need questions before we can look for answers.
Do you want one function per PR?
Yes, please!
Please don't start any new work on this for now—#333 requires me to move Rust code around, which will create merge conflicts for you. I'll comment again when this issue is unblocked again.
@zaowen, apparently you started work on consolidate_whitespace. Please submit it when it's ready, and I'll rebase it myself.
This issue is unblocked now, just make sure you base your PRs on 45bc7c08de460f2d1f0a88f82d1eb9eaffc025e3 or later. (Doesn't apply to zaowen's upcoming consolidate_whitespace PR.)
@Minoru I am taking get_auth_method but I need to add curl_sys as dependency.
By the way, any plans to switch to rust 2018?
Great, thanks! Dependency looks good.
I intend to move to Rust 2018 eventually, but there are no set plans yet. I'm supporting 1.25+ because I want as many people as possible to be able to build Newsboat, and not everyone uses rustup and can obtain latest stable Rust. I haven't yet figured out a way to find out when I can bump the minimal supported version.
Maybe we can keep a CI for rust with the minimum version?
We do, both for macOS and for Linux. By "a way to find out when I can bump the minimal supported version" I meant "how do I know when users moved on from 1.25 to a newer version". I could look at rustc shipped with Debian stable, Fedora, and some other distros, but people sometime try to compile Newsboat on some really old stuff like CentOS 6, so I don't know how well this approach would work.
How should translit be ported while returning two strings?
Um, but translit returns only one string. In fact, I don't see any functions in utils that return a pair.
translit takes two arguments, but so does e.g. run_command--you can look at it for an example.
I am not sure what to do for iconv part. Do I still need to convert the encodings?
@pickfire Of course; that's the point of that function. Please check if libc crate provides bindings to libiconv, or if there are some other crates that does. With those bindings, it should be possible to rewrite the function to Rust.
@pickfire I took a look at crates.io as well, and it seems like there is no bindings available. Also, I suspect that once we moved all the code to Rust, we won't need translit and convert_text at all, because Rust String is always UTF-8—re-encoding will be done by libraries that we'll use for XML parsing and for Ncurses. So let's put those two aside for now.
@Minoru Any news on the rest of the functions that could be ported?
@pickfire All three are available, pick any one and submit a PR ;)
Are there any plans after those functions are ported? And is cxx something that could be used to port newsboat?
@Minoru Regarding extract_filter, do you think it would be good to have Rust return a tuple of &str? I recall that Rust may do some re-arrangement for struct, not sure for tuple.
pub unsafe extern "C" fn rs_extract_filter(line: *const c_char) -> (*mut c_char, *mut c_char) {
Not sure if that is possible. IIRC that was what I was stuck on.
@kpcyrd
Are there any plans after those functions are ported?
Sorry, nothing concrete yet. I need to find some time and pick the next module to port (I prefer to port small chunks myself, to make sure I understood all the dependencies, and to form the general idea of how to proceed).
And is cxx something that could be used to port newsboat?
It requires Rust 1.42+, which is way too new for Newsboat. Maybe in a year or so.
Apropos to that, did you see #709?
@pickfire
do you think it would be good to have Rust return a tuple of
&str?
The "pure Rust" version in rust/libnewsboat/src/utils.rs — yes, let it return the tuple.
The FFI version in rust/libnewsboat-ffi/src/utils.rs — no, because Rust tuples don't have a fixed layout, so you can't easily pass them over FFI. Using a struct might've made sense if this FFI call was actually used throughout the codebase, but since it'll be hidden behind wrappers in both C++ and Rust, I would just copy the current design.
The "pure Rust" version in rust/libnewsboat/src/utils.rs — yes, let it return the tuple.
The tuple makes it a bit in-transparent which one is which, because they don't have names.
Later on we could disambiguate them with types (e.g. make it (String, Uri)), but for now a struct would be better indeed.
I consider working on porting substr_with_width. As far as I can tell the point of the function is to limit the number of terminal "chars" needed to print the string while taking unicode and html tags into account.
Now am I right that I can just assume that the input is going to be UTF-8 encoded? So that I can just assume it is a valid &str and just use to_string_lossy in the C++ interface? I'm not sure because the C++ implementation uses utils::str2wstr on its input which I'm not sure what it actually does.
@rnestler IIRC the terminal only have characters with 1 or 2 (usually CJK characters or emojis) width, I guess you need to get all those characters within the width specified.
@rnestler, your summary of the function's purpose looks correct to me.
All strings inside Newsboat are assumed to be in UTF-8, but don't be too surprised when it turns out they actually aren't. We've been over this when rewriting the logger—turned out some LOG calls wrote binary garbage into the log :) That's one area where I think a panic is acceptable, because this would be a logical bug (an "impossible situation") which should be fixed.
std::string is an array of bytes (Vec<u8> in Rust), while std::wstring is an array of wchar_t (roughly Vec<char> in Rust). The C++ code has to use str2wstr/wstr2str in order to work on the string character-by-character; I guess you'll emulate that with str::chars(). Thus, no, you can't treat std::string as &str, but if you go through CStr like this, you should be fine.
A-a-and we're done here. All the utility functions with simple arguments are now ported. Thanks everyone for the hard work you've put in! :tada:
If you're hungry for more, check out the rust-port label.
We're now out of small self-contained functions, I'm afraid, so new issues will tend to be about whole C++ classes we have. Stay tuned!
Most helpful comment
A-a-and we're done here. All the utility functions with simple arguments are now ported. Thanks everyone for the hard work you've put in! :tada:
If you're hungry for more, check out the
rust-portlabel.We're now out of small self-contained functions, I'm afraid, so new issues will tend to be about whole C++ classes we have. Stay tuned!