Meson: Wrap Tooling Discussion

Created on 16 Aug 2018 · 33 Comments · Source: mesonbuild/meson

Thoughts on wrap tooling

Currently it's a major pain in the caboose to maintain wraps. My
workflow usually looks something like the following:

  1. fork the upstream project I'm trying to wrap (this will make
    something like git@github.com:barcharcraz/project)
  2. add meson build files and test within my fork of upstream
  3. create an issue on [https://github.com/mesonbuild/wrapweb]
    requesting a new repo for my wrap
  4. fork said repo and either copy or cherry-pick the meson build files
    out of my fork of upstream, also add upstream.wrap. This results in
    a repo like git@github.com:barcharcraz/project-1
  5. submit my pull request, without having tested that the wrap
    actually works when using a patch style wrap file from wrapweb
  6. go through various changes to the wrap for style and comments (also
    without having an easy way to test if each change builds)
  7. wrap accepted

When I wrap an upstream I usually go to Fedora's
[release monitoring](https://release-monitoring.org/) site and add myself to
get notifications. Release monitoring (aka Anitya) scrapes project web
pages (and other sources) for new versions and sends messages to
maintainers. I think it's how the new hotness gets its info.

When a new release comes out I usually update the wrap files (again
without really testing) and then open an issue on the mesonbuild/project repo.
When a new branch is created I create a merge request, and eventually it gets merged.

Problems with the current workflow.

It should be obvious that the above workflow isn't ideal. It's
possible that there exist tools that would help that I don't know
about, but from what I gather nobody's wrapping workflow is ideal. I
think this is a major issue in terms of wrapweb and wrapdb
adoption. It's also going to be an issue as meson gets better support
for binary wraps and Windows.

For me wrap development in my fork of upstream is pretty smooth,
usually consisting of reading buildsystem files, running the build and
looking at compile options, and running lots of commands like
find . -name "*.c" | ... | xclip -selection c. The process of requesting a
new wrap is also mostly fine; after all we're interested in having
each wrap be maintained. The extremely annoying part is testing the
patch-style wrap in the context of other projects, and in
transitioning from my fork of upstream to a wrapweb style repo (I
almost always screw some stylistic thing up).

Ideas.

I've got several ideas and possible projects that I'd love to hear thoughts on.

A wrapdevtools package

Fedora has [rpmdevtools](https://pagure.io/rpmdevtools) which is
developed separately from rpm and yum/dnf as just some (somewhat
documented) scripts to assist development of packages. There's also
tito, but personally I've found rpmdevtools more useful. These tools
should probably arise as a sort of community effort between package
maintainers, each tool addressing an immediate need and standing on
its own.

Some ideas include:

  • wrapdev-initwrap to download a package (maybe to package-cache),
    examine it for the filename and folder name, and make a wrap file
    with the hash of whatever was downloaded. (this could be useful for users as well)
  • wrapdev-testwrap to test using a wrap as a subproject
  • wrapdev-lint to lint wraps
  • wrapdev-updatehelper to download the current version and a new
    version (from upstream) and help the maintainer update. For example
    it could highlight any changes in the upstream's build files, as
    well as any added or removed source files
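
As a rough illustration of the wrapdev-initwrap idea, here is a minimal Python sketch. It assumes the archive has already been downloaded (so the sketch itself does no network access) and that it unpacks into a single top-level directory. The function names are hypothetical; the keys written out (directory, source_url, source_filename, source_hash) are the usual [wrap-file] keys.

```python
import hashlib
import tarfile
from pathlib import Path


def top_level_directory(archive_path):
    """Return the single top-level directory of a tarball, or None if there isn't one."""
    with tarfile.open(archive_path) as tar:
        names = [n[2:] if n.startswith("./") else n for n in tar.getnames()]
    roots = {n.split("/", 1)[0] for n in names if n not in ("", ".")}
    return roots.pop() if len(roots) == 1 else None


def sha256_of(path):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_wrap_text(archive_path, source_url):
    """Build the text of a [wrap-file] wrap for an already-downloaded archive."""
    archive_path = Path(archive_path)
    directory = top_level_directory(archive_path)
    if directory is None:
        raise ValueError("archive does not have a single top-level directory")
    lines = [
        "[wrap-file]",
        f"directory = {directory}",
        f"source_url = {source_url}",
        f"source_filename = {archive_path.name}",
        f"source_hash = {sha256_of(archive_path)}",
    ]
    return "\n".join(lines) + "\n"
```

A caller would write the returned text to subprojects/<name>.wrap. Downloading and caching are left out on purpose, in line with keeping network access in separate tooling.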

Some of these tools could be part of meson wrap as well, but I don't
really see any reason they need to be maintained as part of meson
itself. I would lean toward these tools not doing network access or
interacting with git, and leave that to the wrapweb maintainer tools
(that are currently in the wrapweb repo, but not super useful on their
own)

Some clarification on [wrap-git]

[wrap-git] style wraps are somewhat confusing to newcomers (I
think). It's not clear from the documentation when you'd use them vs a
git submodule under subprojects.

New kinds of wraps.

Currently meson has three kinds of wraps, [wrap-git], [wrap-file]
and [wrap-file] with patches. The wrapdev-initwrap tool above
could be useful for users dealing with the first two types as
well. The first two kinds of wraps are pretty simple, and while some
new tools might be handy I don't think they cause much trouble. The
patch style wrap is annoying to use because you need to specify the
full url, filename, and hash for the patch, and since wrapweb
generates these patch-style wraps using its repo and the
upstream.wrap there's no easy way to even generate the patch tarball
during development!
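
For illustration, a small Python sketch of what generating the patch archive locally could look like. It assumes the build files live in their own directory and that the archive should unpack on top of the upstream source directory. The function name, the zip format, and the file layout are assumptions, not a description of what wrapweb actually does.

```python
import hashlib
import zipfile
from pathlib import Path


def pack_patch(build_files_dir, directory, output_path):
    """Zip the build files under `directory/` and return the matching wrap fields.

    `output_path` should live outside `build_files_dir`, otherwise the
    archive would end up containing itself.
    """
    build_files_dir = Path(build_files_dir)
    output_path = Path(output_path)
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(build_files_dir.rglob("*")):
            if path.is_file():
                rel = path.relative_to(build_files_dir).as_posix()
                # Prefix with the source directory name so the files land
                # next to the upstream sources when unpacked.
                archive.write(path, f"{directory}/{rel}")
    digest = hashlib.sha256(output_path.read_bytes()).hexdigest()
    return {"patch_filename": output_path.name, "patch_hash": digest}
```

The returned values can be pasted into the wrap file as patch_filename and patch_hash; patch_url still has to point at wherever the archive ends up being hosted.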

Perhaps we could allow specifying a directory (within subprojects)
instead of a patch_url, patch_filename, and patch_hash. This way
if you need to wrap a project you can make a project.wrap file, and a
folder with build definitions and test the wrap in the context of the
project that needs to consume the wrap. There are issues with how to
structure this though (do we copy the build files over to the source
tree every time they change, modify meson to allow build files to
pretend to be in a different source tree?). This would make it easy to
contribute the wrap files to wrapweb when they are working well
enough, and allowing this may make it easier to test the wrap with
different projects.
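
One of the options mentioned above, copying the build files over the source tree whenever they change, can be sketched in a few lines of Python. This is only an illustration of that option with hypothetical names and paths; it is not how meson handles patches.

```python
import shutil
from pathlib import Path


def overlay_build_files(patch_dir, source_dir):
    """Copy new or changed files from patch_dir into source_dir.

    Returns the relative paths that were actually copied, so a caller can
    tell whether a reconfigure is needed.
    """
    patch_dir, source_dir = Path(patch_dir), Path(source_dir)
    copied = []
    for src in sorted(patch_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(patch_dir)
        dest = source_dir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Skip files whose content is already identical in the source tree.
        if not dest.exists() or dest.read_bytes() != src.read_bytes():
            shutil.copy2(src, dest)
            copied.append(rel.as_posix())
    return copied
```

For example, overlay_build_files("subprojects/zlib-meson", "subprojects/zlib-1.2.11") would refresh the unpacked sources in place. The obvious drawback is that the source tree and the build definitions can drift apart if someone edits the copies instead of the originals.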

Local wrapweb servers

I think the local devel wrapdb server would have to behave quite
differently from the "real" one, so this probably isn't such a great
idea.


All 33 comments

From a contributor perspective, the high-level workflow I would ideally like to follow when upstream uses git (i.e. the majority of the time) is:

  • Fork upstream's git repository
  • Port to meson
  • Propose that the fork be adopted by the mesonbuild organization
  • Done :)

This is an intentionally naive description, but I see no reason why things cannot be kept simple like that.

From personal experience, I can confirm that the current workflow is detrimental to the wrap database growth :)

There is already a helper script for some of these in the wrapweb repo. There's a lot of stuff missing, though.

I guess it could be updated to be used something like this:

createwrap <release tag> <url of upstream tarball>

Which would then create the upstream wrap, and a diff from the release tag to the current commit. It could then be imported to the wrapweb Git repo.

The reason we don't just ship the entire upstream Git repo is that a) Github would probably not like that if we get popular and b) it is fundamentally insecure and hard to audit. Pointing to the original source tarball plus one with extra files is simple and easy to audit. Using plain Git would work for an "internal project" but not particularly well here because auditing that the repo someone is pushing is actually correct and valid and not backdoored with extra commits is a lot more work. This is a security risk.

> a) Github would probably not like that if we get popular

Cannot really argue with that, but I imagine github's infrastructure would manage :) If this became a concern the repositories could easily be transferred elsewhere.

> b) it is fundamentally insecure and hard to audit.

Programmatically checking that the fork only adds new files on top of upstream's sources would be pretty trivial, wouldn't it?

More generally, please bear in mind that I'm arguing from a potential contributor perspective, I think lowering the barrier to contribution should be a (if not the) priority.

The trouble with createwrap is that currently it seems to work with the meson github org, which contributors don't have access to. Also it needs documentation. It might be helpful to have some screen/session recordings of the meson org (@sarum9in) side of creating a new wrap.

I still kinda like the idea of being able to develop the patch files in their own directory, just because it also facilitates binary wraps and "hacky" wraps. And IMO it could make the process of "oh crud I need to wrap this library" -> "the wrap of this library I just made actually works for me, a real consuming project" more smooth.

I guess it should be noted that I'm not the BDFL (or anything really) of wrapweb. Other people make the actual decisions.

> the repositories could easily be transferred elsewhere

This is weasel wording. What is "easily"? Where is "elsewhere"? Who would maintain it? Who would pay for it? How could we achieve the same service reliability as Github? By coding it ourselves? Who would implement that? Etc. Etc Etc.

> would be pretty trivial wouldn't it

You would need to prove that the _entire repo_ is valid. A sample attack would go like this:

  • take upstream repo
  • add your backdoor in the commit history and rewrite it
  • add release tag pointing to a commit that has the backdoor
  • add only good commits on top of it
  • submit the result, somehow social engineering or with some other method cause the reviewer to miss the fact that the upstream release tag has a different hash than is in the submitted repo
  • 0wn everyone

Any verifier script that only runs on the submitted repo would fail to catch this, because on top of the (tampered) release tag the repo contains only build file additions.

This is not to say that it could not be made to work. But the point is that Git + commit hashes + other stuff is _inherently more complicated_ than a link to upstream tarball + build files on top. Things that are more complex are more likely to contain security vulnerabilities.
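
To make the "upstream tarball + build files on top" check concrete, here is a hedged Python sketch. It compares a submitted source tree against an upstream release tarball that the reviewer has obtained independently, rather than trusting any tag inside the submitted repository. The function names are hypothetical, and it assumes the tarball unpacks into a single top-level directory.

```python
import hashlib
import tarfile
from pathlib import Path


def upstream_digests(archive_path):
    """Map each regular file in the upstream tarball to its SHA-256 digest."""
    digests = {}
    with tarfile.open(archive_path) as tar:
        for member in tar:
            if member.isfile():
                # Drop the top-level directory so paths are relative to the source root.
                rel = member.name.split("/", 1)[-1]
                digests[rel] = hashlib.sha256(tar.extractfile(member).read()).hexdigest()
    return digests


def audit_tree(archive_path, tree_dir):
    """Classify files in tree_dir as added, modified, or removed relative to upstream."""
    upstream = upstream_digests(archive_path)
    tree_dir = Path(tree_dir)
    added, modified, seen = [], [], set()
    for path in sorted(tree_dir.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(tree_dir).as_posix()
        if rel.startswith(".git/"):
            continue  # repository metadata is not part of the sources
        seen.add(rel)
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if rel not in upstream:
            added.append(rel)
        elif upstream[rel] != digest:
            modified.append(rel)
    removed = sorted(set(upstream) - seen)
    return {"added": added, "modified": modified, "removed": removed}
```

A submission that really only adds build files should produce empty "modified" and "removed" lists. This says nothing about the git history itself, which is the harder part described in the attack above.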

However if someone can come up with a workflow that makes contributing easier without complicating verification et al we should _totally_ do that. But note that the current model is taken directly from Debian. Most packages are maintained (AFAICR) in some sort of revision control but the end result is always upstream tarball + the patch file even today. I'm guessing this is due to the same auditability reasons as discussed above.

Could one have a 'canonical upstream git repository' and a 'meson wrap git repository' and then basically diff based on an upstream repository branch or tag? i.e. require that the meson wrap is rebased on top of an authoritative upstream branch/tag.

So most Debian packages have a full mirror of the upstream source tree in Debian's git repos (using gbp). fwiw Fedora's model is probably closer. Most SRPMs include the upstream distribution exactly (although it's mirrored with dist-git).

W.R.T. @MathieuDuponchelle's suggestions I think that could be done with tooling improvements, a tool could probably do most of steps 4 and 5 in my initial post. In fact createwrap already kinda does, but it needs polishing and documentation.

OK Here's a simple, concrete proposal that I think I can get to PoC in an afternoon or two:

Allow a wrap file such as the following:

[wrap-file]
directory=zlib-1.2.11
source_url=https://example.com/zlib-upstream/whatever.tar.gz
source_hash=00000000000000000000000000000000000000

patch_directory=zlib-meson

This would reduce friction at least for _my_ workflow, but it's a change in meson itself so I'd like @jpakkane's input.

> Could one have a 'canonical upstream git repository' and a 'meson wrap git repository' and then basically diff based on an upstream repository branch or tag? i.e. require that the meson wrap is rebased on top of an authoritative upstream branch/tag.

Yes, none of this strikes me as a very difficult problem

> This is weasel wording. What is "easily"? Where is "elsewhere"? Who would maintain it? Who would pay for it? How could we achieve the same service reliability as Github? By coding it ourselves? Who would implement that? Etc. Etc Etc.

I don't know how that is weasel wording, that is an answer to a very hypothetical "what if"

And if I might add, reliability isn't exactly a selling point of the current solution anyway is it?

The only real requirement I have from Meson side is that WrapDB must not, I repeat not, require the use of Git (or any other revision control system) for consuming packages. The decision on how data is stored behind the scenes is up to @sarum9in who is the "wrapdb master".

Any other comment made by me above should be considered purely advisory, feel free to ignore any or all of it if you come up with something better.

> The only real requirement I have from Meson side is that WrapDB must not, I repeat not, require the use of Git (or any other revision control system) for consuming packages.

In the github context, a reasonable fallback could be to simply download the repo as a zip file instead of eg making a shallow clone if git is unavailable on the system, or if configured to do so?

The requirement is not "use something else if Git is not available". It is "must not use Git at all"!

The artifacts must be plain file archives (such as release tarballs). If possible they should be GPG signed. There is no support for signature verification in Meson yet, but at some point we probably should add that.

And just to be sure: this is only for stuff in WrapDB. People (and corporations and what have you) hosting their own things can use Git repos or submodules or whatever they want and that is totally cool. They can do that because their servers are privately hosted and only a very limited number of people have access to said repos. WrapDB is a public service so it needs to be a lot stricter.

> The requirement is not "use something else if Git is not available". It is "must not use Git at all"!

Well, still not a problem

I realized that my desired workflow actually doesn't need any changes in meson. I'm writing some python helper scripts and examples now.

Let me drop a few notes, before I read this thread again, about how I am using it and my thoughts.

  1. I acknowledge that the current workflow is suboptimal. Fixing it is not trivial, but I am slowly working towards it. The first step is to eliminate the wrapdb server as the single storage of wraps and make it stateless. I have already moved all wraps to github. The idea is to split the serving and creating paths. It is a blocker mostly because currently it is a single point of failure, and it is really bad if we break it accidentally.
  2. Wrap tests: currently this is neither enforced nor supported. This should not require any wrap server, provided that the tool is maintained by meson developers who will make sure that it is in sync with how wrapdb works. I am talking about minimal projects that would act as integration tests for wraps. Such a project would try to use the wrap and run some primitive tests.
  3. Wraps are tested with ninja test, provided that they have anything to run. Otherwise only basic verification is performed (meson.build is parseable). The tool for this is mesonwrap review, see cli.py.
  4. wrap review also supports --export_sources to dump sources with a patch somewhere to test it locally.
  5. There is an almost unknown http://github.com/mesonbuild-test which I created to test org commands without disrupting the main org. Not sure if it is useful for other people.
  6. A word on @barcharcraz's proposal for wrapdevtools: I would appreciate it if we work on it in https://github.com/mesonbuild/wrapweb/tree/master/mesonwrap/tools, that way the project will not diverge. Also maybe some of the tools you need already exist there. Feel free to update https://github.com/mesonbuild/wrapweb/issues/31 with ideas.
  7. I can't think of a way to improve wrap creation requests at the moment. I am trying to be responsive, but obviously I can't always guarantee it. The way I create wraps is roughly this:
$ cd ~/wrapdb
$ mesonwrap new_repo --homepage https://libgit2.org/ --version=0.27.4 --directory libgit2 libgit2

This command creates a new repository with readme, description, etc. It also creates a new branch for the specified version. This command can be run only by a mesonbuild admin.

I agree on the location for wrapdevtools work. I've put some preliminary tools for my workflow at http://github.com/barcharcraz/wrapdevtools. I'm going to write up my workflow using them and then we can work on getting them into mesonwrap itself.

Oh, by the way. The reason I made an issue in mesonbuild/meson instead of in wrapweb was just for visibility, seeing as I was fishing for input from other current or potential wrap maintainers.

@barcharcraz , nice! It's obviously a good thing to document a working process and put up tools to aid with the existing design :)

How would you propose facilitating the workflow I outlined?

For example, say I want to write meson build definitions for an active project, where upstream has no intention, for one reason or another, to adopt the meson build definitions.

As I want to keep my meson build definitions up to date, I have initially based them on upstream's master, then backported the patches to the latest stable release branch. I would like to solely maintain my fork, by occasionally rebasing, and potentially backporting more patches to the stable branch, and not have to maintain a separate repository containing only the build definitions (the "wrap" repository in meson's organization), as duplicating is bad and bound to create issues down the line.

Do you have suggestions on how to achieve that?

Tools that are currently implemented in our codebase are mostly intended for reviewing and accepting wraps, not creating them. So I think it's a good idea to have a separate set of tools specifically for that. I would probably prefer to have them in the mesonwrap repository to keep up to date with any internal changes, however their development should be community-driven as it represents needs of community, not meson developers, in the first place. So in the end it is probably up to @barcharcraz if you want to maintain them in a separate repository. For visibility we probably should have it under mesonbuild org though.

I think the workflow I used in the example would work OK with git, but I want to avoid having a tool where you need to pass huge numbers of command line parameters for stuff like the upstream repo, upstream branch, your repo, your branch, etc.

Right now the extractpatch looks at a wrapfile (generated by newwrap) to figure out what directory is the source and where the pristine tarball is. I suppose this could be extended to allow looking at a wrap-git repo. So it would automatically know the "true" upstream from the wrapfile and could look at the working tree (or index) to figure out what constitutes the patch. I guess I can take a swing at it.

The issue of backporting build fixes is more complex. I know I've made a wrap where I specified a meson version too old to actually build the wrap; it got caught with 0.47.0's new warnings but it was already committed. Wrapdb already has release versions.

I'm down to maintain them in mesonwrap, but docgen needs to be set up (do we use hotdoc, is hotdoc healthy, etc). Also they should probably be a separate pypi package, which may be awkward in the same repo.

Makes sense, let's create a separate repository then. wrapdevtools sounds like a good name. @jpakkane @barcharcraz any objections? I will give @barcharcraz write access to the repository. Also a side note that wrapdevtools is currently under GPL3 while the rest of meson is Apache2. I don't think it's a problem for a separate project, but worth paying attention to.

No objections here. I’m willing to relicense to Apache2, but I picked GPL because I figured copyleft fit the project and it’s my default license. I’m not sure what the implications of GPL would be if meson wants to use code from wrapdevtools, so I’m tempted to relicense.

It's a good point. While there should be no dependency on wrapdevtools someone might want to copy or move some pieces of code later.

I've ported a couple of dependencies I've needed but never uploaded to wrapdb. I think having a tool that would allow me to do a wrapdb publish command would be great.

I would love if my workflow could be:

wraptool init <http-link-to-tarball> # creates a git project, downloads, unpacks tarball, creates correct .gitignore file, basically sets up dev environment...
... create meson.build files ...
git commit
wraptool publish

we're working on it!

Because your wrap uses the name of the project you're wrapping, we do have a review process that's more thorough than what you may find on pypi or rubygems.

> wrapdevtools is currently under GPL3 while the rest of meson is Apache2

This is not really nice because we may want to share code between wrapdevtools and Meson itself.

since I'm the only contributor right now I decided to re-license it

which should be done.

Thanks.

The repo should be good to go barring any more issues. Just ping me when it’s up (Demos on matrix, Demos[m] on irc).
