-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Redo vendoring based on pypi.org/project/vendoring/ #49752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Redo vendoring based on pypi.org/project/vendoring/ #49752
Conversation
I might err towards an RFC for visibility. The change seems reasonable on the face of it; not going to review in detail obviously. |
b240115
to
6471996
Compare
It seems like Taskcluster is fundamentally unable to run this PR; it leaves comments such as 6471996#commitcomment-150981848:
|
6471996
to
3791a6c
Compare
With the RFC now merged, can we get this reviewed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is probably OK, but it's a bit harder to tell because there are multiple changes that all contain generated output. I understand the reason might be to associate each change in output with a specific change in the code, but when we add something in one commit and remove it in the next it gets a bit confusing. I think I'd prefer to see just the commits that make human-authored changes first, and then a single commit with all the tool-generated changes at the end.
tools/requirements_vendor.txt
Outdated
# This file was autogenerated by uv via the following command: | ||
# uv pip compile --universal --python-version 3.8 requirements_vendor.in -o requirements_vendor.txt | ||
|
||
# This file was then modified to remove all markers, as we want to make |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a big fan of this having undefined post-processing steps. Is it something you could write a short script for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! (And it's not undefined, it's defined in the comment there!)
|
||
[tool.vendoring.transformations] | ||
drop = [ | ||
"*.dist-info", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would not dropping these help with the pytest-asyncio situation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll need to check, but I think this was mostly motivated by changes between pre-release and stable vendoring, and getting consistent output between the two.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, getting consistent output with/without pradyunsg/vendoring@861e561.
We could definitely keep all the metadata, if we want to, which does avoid the pytest-asyncio
reference — though honestly explicitly including it probably isn't a bad idea, because it means we loudly and clearly fail if for some reason the plugin isn't present.
3791a6c
to
cf8ef15
Compare
I've split up the commits now; this does mean a6b4743 is a bogus commit that cannot pass CI, because it has updated all the paths for the vendored code being re-vendored, but doesn't actually have the re-vendoring. If we actually want to land it like this, I guess we need to make that commit be able to cope with either the old-style or new-style vendoring. And despite the branch name, I've actually added a commit to update everything, which I think we may well want to land separately still. |
A bunch of the Azure Pipelines failures are #51884. |
cf8ef15
to
21c9a9e
Compare
This allows us to move the third_party directory to being managed by Python tooling.
This moves us to relying on the vendoring tool to do our vendoring, rather than doing this on an ad-hoc basis. See: tools/pyproject.toml, tools/requirements_vendor.txt, tools/third_party_patches. This doesn't change what we have vendored, it merely moves it over to the new vendoring mechanism. Notably, this leaves us _without_ all the non-installed bits of these various third-party libraries: we no longer have docs or tests or GitHub Actions Workflows or anything else. This also means we only need to have third_party on the sys.path, and nothing else. This requires two changes to our actual code: * tools/wpt/android.py, because we've changed the path of tooltool (because that's how it is packaged), and * webdriver/tests/conftest.py, because we no longer have the *.egg-info (or *.dist-info) from pytest-asyncio, and thus must manually register the plugin entrypoint. For anyone wishing to audit this commit, the following may be helpful: git diff --raw HEAD^- | cut -d' ' -f5- | sort
Per e08f471 it appears it was intended to vendor html5lib==1.1, but instead the commit after 1.1 seems to have been vendored. Rather than vendoring a random revision from git, downgrade to the actual release, which is merely changes the version number. The removal of html5lib/LICENSE.md here is purely because vendoring is now looking purely at the wheel for the license, rather than the entire repository, thus it no longer picks up the license in benchmarks/data/wpt/LICENSE.md, as no part of that directory is in the distribution package.
Then, using uv, we can build requirements_vendor.txt and remove various unused dependencies. This isn't perfect, because we want to vendor everything regardless of markers, and thus this still requires manual fixup to remove all markers from the output. This allows us to remove a number of packages (attrs, certifi, importlib-metadata, more-itertools, pathlib2, and zipp), but does add a few extra (colorama, which gives pytest better colourized output on Windows; pyparsing, which packaging relies on to parse markers and requirements; and tomli, which pytest relies on to parse *.toml files on Python < 3.11).
21c9a9e
to
1a13808
Compare
And removed again, to be landed separately after this. |
|
If some of the breakage is things trying to use |
See individual commits here.
In short, this uses vendoring to re-do all of our Python vendoring, directly into third_party (crudely:
pip install -t tools/third_party ...
), and then moves to a generatedrequirements_vendor.txt
which finds several recursive dependencies we'd missed and makes it obvious we can drop others.The only version change here is we go from vendoring html5lib as of html5lib/html5lib-python@f4646e6 to the html5lib==1.1 release; the only change between that commit and the 1.1 release is going from
__version__ = "1.2-dev"
to__version__ = "1.1"
.This is a huge negative line diff primarily because we're no longer vendoring all the documentations and unit tests and CI workflows and all from everything we vendor; instead we're just vendoring the actual distribution packages, with their licenses.
I'll argue this doesn't require an RFC because we're not actually vendoring anything new here — we're merely changing how we do our pre-existing vendoring, and this should have no impact on any downstream consumers. (This is pedantically untrue: we've changed how we define
sys.path
which might break if others aren't relying ontools.localpaths
for this, and we've removed vendored packages we no longer use which others could rely on elsewhere.)Why? Because having all the documentation and everything is just a nuisance, and having a huge number of items added to
sys.path
is painful (as it makes it long and hard to understand what's going on, and makes us more dissimilar to how these packages expect to be installed). Having this automated also makes it much easier to update our vendored packages, or add further new vendored packages in future (as #49598 does).