Moving WebKit to Git

by Jonathan Bedard (Apple) Slide Deck

A short outline of the steps we’re taking and processes that will change in the next few months as we move WebKit off of Subversion and onto GitHub.

Jonathan: This presentation will be about the timeline of moving WebKit to Git, along with the processes that come with it

Jonathan: First, why are we going with GitHub instead of another solution (namely, self-hosted GitLab)?

Jonathan: GitHub is familiar to new contributors and to web developers filing bugs, it has a strong community developing supporting tools, and the repository is managed by a 3rd party (viewed as a plus by the maintainers)

Jonathan: This explains why we are going with GitHub instead of a self-hosted solution

Jonathan: Here is an overview of the timeline. It has changed since the last one that was shared because the git clone is taking longer than expected.

Jonathan: Whenever large changes are made, we will be transparent about it ahead of time and will make sure that it is shared with our contributors

Jonathan: Here are some processes that are going to be changing

Jonathan: Commit Identifiers: With svn revisions, we can know roughly where a commit is in time. We will be transitioning to commit identifiers, which also reflect what branch the commit is on

Jonathan: These identifiers are basically a translation layer for humans (since 12-character git hashes don’t mean much to humans); our tools will still work with git hashes
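
To illustrate the "translation layer" idea, here is a minimal sketch of parsing and printing such an identifier. The notes do not spell out the format, so the `<count>@<branch>` shape used here, along with the names `CommitIdentifier` and `parse_identifier`, is an assumption made for illustration.

```python
import re
from typing import NamedTuple

# Assumed format: "<count>@<branch>", e.g. "1234@main". The notes only say
# that identifiers reflect the branch; this exact shape is an assumption.
IDENTIFIER_RE = re.compile(r"^(?P<count>\d+)@(?P<branch>[\w./-]+)$")


class CommitIdentifier(NamedTuple):
    count: int   # position of the commit on its branch, counted from 1
    branch: str  # branch the commit is on

    def __str__(self) -> str:
        return f"{self.count}@{self.branch}"


def parse_identifier(text: str) -> CommitIdentifier:
    """Parse a human-readable identifier; raise ValueError if malformed."""
    match = IDENTIFIER_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a commit identifier: {text!r}")
    return CommitIdentifier(int(match.group("count")), match.group("branch"))
```

Because the count comes first in the tuple, identifiers on the same branch sort in commit order, which is the property svn revision numbers gave for free.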

Jonathan: Protected branches: This is something done by other projects on GitHub, and we are going to do it too.

Jonathan: Certain branches will receive additional protection, not everyone will be able to land changes.

Jonathan: There will need to be a set of repository administrators who can modify the set of protected branches. They should be roughly the same people who already have ssh access to the servers hosting the current repositories.

Jonathan: Main (formerly known as master) will be protected

Jonathan: Commit queue will be mandatory for commits to protected branches

Jonathan: Since there are some changes that developers don’t want to wait for, we will add a “fast” mode to commit queue that respects “Unreviewed xxx fix” in lieu of a reviewer
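
A rough sketch of how a commit queue might choose between its normal and "fast" modes from the commit message alone. The "Reviewed by <name>" line and the matching rules are assumptions; the notes only mention the “Unreviewed xxx fix” wording, and `landing_mode` is a hypothetical name.

```python
import re

# Assumed message conventions: a "Reviewed by <name>" line for reviewed
# changes, and "Unreviewed <something> fix" for changes landed without review.
REVIEWED_RE = re.compile(r"^\s*Reviewed by\s+\S.*$", re.MULTILINE)
UNREVIEWED_FIX_RE = re.compile(
    r"^\s*Unreviewed\b.*\bfix\b", re.MULTILINE | re.IGNORECASE
)


def landing_mode(commit_message: str) -> str:
    """Return 'normal', 'fast', or 'reject' for a commit message."""
    if REVIEWED_RE.search(commit_message):
        return "normal"  # has a reviewer: run the full queue
    if UNREVIEWED_FIX_RE.search(commit_message):
        return "fast"    # explicitly unreviewed fix: skip the long checks
    return "reject"      # neither a reviewer nor an unreviewed-fix marker
```

For example, `landing_mode("Unreviewed build fix.")` returns `"fast"`, while a message with neither marker is rejected rather than landed silently.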

Jonathan: With that in mind, why have commit queue at all?

Jonathan: We need it to handle some post-commit actions that are unsupported by GitHub

Jonathan: Pull-requests will be squash merged initially

Jonathan: This is because we believe EWS should act on each commit independently, and we aren’t sure if we can handle that throughput at the moment. May change in the future.

Jonathan: What is happening with patch workflows?

Jonathan: We intend to replace patches with pull requests. However, the patch workflow will remain functional throughout the transition.

Jonathan: Patch workflow may remain indefinitely for security bugs due to Git’s way of reasoning about branches.

Jonathan: Trac doesn’t work with large git repositories, so we will be getting rid of it.

Jonathan: Existing trac links will redirect to GitHub by way of a redirect service (since github will not natively respect commit identifiers)
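
As a sketch of what the core of such a redirect service could look like: given an old changeset link and a table mapping svn revisions to git hashes, compute the GitHub URL. Both URL shapes and the `redirect_target` helper are assumptions for illustration, not the actual service.

```python
import re
from typing import Mapping, Optional

# Assumed URL shapes, for illustration only.
TRAC_CHANGESET_RE = re.compile(
    r"^https?://trac\.webkit\.org/changeset/r?(?P<rev>\d+)(?:[/?#].*)?$"
)
GITHUB_COMMIT_URL = "https://github.com/WebKit/WebKit/commit/{sha}"  # hypothetical target


def redirect_target(trac_url: str, revision_to_sha: Mapping[int, str]) -> Optional[str]:
    """Map an old changeset link to a GitHub commit URL, or None if unknown."""
    match = TRAC_CHANGESET_RE.match(trac_url)
    if match is None:
        return None  # not a changeset link
    sha = revision_to_sha.get(int(match.group("rev")))
    if sha is None:
        return None  # revision not present in the migrated history
    return GITHUB_COMMIT_URL.format(sha=sha)
```

Returning `None` for unknown links leaves it to the caller to decide whether to show an error page or fall back to a search.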

Jonathan: Bug Tracker will not be changed until the repository migration is complete; it will be the last thing to change.

Jonathan: Before we consider pushing developers away from bugzilla, we will need to support new incoming issues on GitHub.

Jonathan: Bugzilla will remain for some use cases, namely security bugs

Jonathan: More migration plans will be communicated in the Spring after we finish the migration

Simon: Northern or Southern hemisphere spring?

Jonathan: Northern hemisphere, March/April

Ryosuke: If I look at the blame on GitHub there will be no identifier, how can I identify it?

Jonathan: That is why we want to add the commit identifiers in the commit messages

Ryosuke: But I want to see identifiers in git blame

Jonathan: GitHub doesn’t support this, the only solution would be to self-host the repo and figure out how to change the UI

Alexey: We don’t want to get rid of trac, we just don’t expect it to work well with GitHub. If we can find a way to keep it working, we’ll leave it, but it won’t support commit identifiers

Ryosuke: I want an interactive solution that works with source tree, otherwise this is only half functional

Ryosuke: The proposed solution requires me to follow 5 or 6 different links

Jonathan: The most practical solution may be a client-side plugin or extension. Given a hash it is trivial to convert it to an identifier, we just need something to do it.
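
A minimal sketch of the hash-to-identifier conversion such a plugin would need. It assumes identifiers are the commit's 1-based position on the branch followed by `@<branch>` (the notes do not give the exact format), and that the list of hashes comes from something like `git rev-list --first-parent --reverse <branch>`. The function names are hypothetical.

```python
from typing import Dict, List, Optional


def build_identifier_table(hashes_oldest_first: List[str], branch: str) -> Dict[str, str]:
    """Number each commit on the branch, oldest first, starting at 1."""
    return {
        sha.lower(): f"{position}@{branch}"
        for position, sha in enumerate(hashes_oldest_first, start=1)
    }


def identifier_for(abbrev: str, table: Dict[str, str]) -> Optional[str]:
    """Resolve a full or abbreviated hash; None if unknown or ambiguous."""
    prefix = abbrev.lower()
    matches = [ident for sha, ident in table.items() if sha.startswith(prefix)]
    return matches[0] if len(matches) == 1 else None
```

A browser extension could ship or fetch such a table and annotate the hashes shown in a blame view, which would avoid following several links by hand.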

Ryosuke: That would be an acceptable solution to me.

Maciej: I just tried git blame and it doesn’t show the identifier, just a short commit message and how many years ago to the side of the line.

Maciej: When you hover on the commit message, it shows the full commit message which should include our commit identifier, so this may not be an issue with the GitHub blame.

Sam: GitHub blame should include the date by default.

Maciej: Question/comment: What exactly does it mean to support bugzilla patch review and pull requests? Ultimately, things have to be pull requests or we support direct commit to protected branches. I wouldn’t want everyone to be able to direct commit just because they say they got a bugzilla review. It would be confusing to have two review systems.

Jonathan: Since we have a commit queue handling the landing of patches, we intend to take that away.

Maciej: We should have a plan to phase out the patch system, it doesn’t make sense to have two forms of patch review for the indefinite future. Maybe others have a different opinion.

Aakash: If we keep security bugs in bugzilla, we need both forms indefinitely.

Jonathan: The concern is forcing people to think about both systems, not necessarily supporting code for both systems. Maybe we just disable patch review for non-security bugs?

Maciej: Once a thing has been r+d, we don’t have to keep it secret anymore. We want things to be hidden until we are ready to land it. GitHub private repos don’t give us the properties we want, we will probably have to talk to them about that. If security fixes have to be weird, those who deal with them habitually may have to just deal with that.

Jonathan: Losing EWS for security fixes would be undesirable.

Jen: Drupal went through similar considerations 10 years ago, we may be able to learn from them.

Simon: Do we also plan to remove the trac wiki? That is where we have been keeping contributor meetings notes.

Jonathan: We should move to the GitHub wiki, but we haven’t looked at it deeply enough to see what it would entail. We intend to migrate it somewhere, ideally GitHub.

Simon: We would keep trac running for the wiki even if we don’t use it for SVN?

Jonathan: Yes

Ryosuke: There was a discussion about fork vs non-fork for PR, do we have a resolution there?

Jonathan: We don’t have a resolution, it doesn’t make sense to forbid it. My sense is that being a committer lets you create unprotected branches, which is what GitHub access allows, but we will end up with a large number of branches. We’ll need to follow up to see if that bothers people.

Ryosuke: I hate seeing all the branches on large GitHub projects with incomprehensible names.

Jonathan: We will have some system to clean up branches.

Maciej: One approach to avoid superfluous branches could be for commit queue to delete PR branches immediately. If we need them to stick around for some time for reference, we could delete them after some delay.

Jonathan: Another approach is that we could put old PR branches under a common prefix (e.g. old/<name>), so we can group them and hide them from searches. That would keep them around but won’t pollute everyone’s history
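
The two cleanup ideas above (delete after a delay, or move under `old/`) could be sketched as a single planning function. The 14-day grace period, the `PROTECTED` set, and the `plan_cleanup` name are assumptions for illustration; nothing here was decided in the meeting.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

ARCHIVE_PREFIX = "old/"   # grouping for stale PR branches, as in old/<name>
PROTECTED = {"main"}      # assumed set of protected branches


def plan_cleanup(
    branch: str,
    merged_at: Optional[datetime],
    now: datetime,
    grace: timedelta = timedelta(days=14),  # assumed delay
    archive: bool = True,
) -> Tuple[str, str]:
    """Return (action, branch name after the action) for one branch."""
    if branch in PROTECTED or branch.startswith(ARCHIVE_PREFIX):
        return ("keep", branch)      # never touch protected or archived branches
    if merged_at is None or now - merged_at < grace:
        return ("keep", branch)      # still open, or merged too recently
    if archive:
        return ("rename", ARCHIVE_PREFIX + branch)
    return ("delete", branch)
```

Setting `grace` to zero and `archive` to `False` gives the "delete immediately" variant.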

Christopher: Will the current git webkit mirror be replaced with new commits?

Jonathan: It will be replaced. It was originally http, tried changing it to https which didn’t work (and changed commits). I have a git clone of WebKit that I am adding commit identifiers to and updating committers. I will publish that as the canonical repository.

Michael: I wanted to mention that GitLab lets you host your projects. If you pay for a private fork of a project, it supports confidential merge requests that can target that private fork. It may not have the fine grained control that you are looking for.

Alexey: We need to talk to GitHub about their ideas for how to solve it. We don’t have a crazy demand that wouldn’t apply to anyone else.

Jonathan: I am working on talking with GitHub folks about this, but no updates to share.

Ryosuke: I really hate git. The only reason we should move to git is if we are going to use GitHub. If we aren’t moving there, it isn’t worth it.

Jonathan: We are slowly losing support for SVN on the platforms we develop for

Jonathan: There are also lots of CI systems that are having trouble with supporting SVN, or don’t support it at all

Alexey: Does anyone in this group have tools that we aren’t thinking about needing to update?

Don: No affected tools on the Sony side, but the changing git hashes will impact us.

Alexey: Can we do anything to help?

Don: We’re only finding out about it now, haven’t yet talked about it as a team. We would definitely prefer a pure git repo, especially because not all safari branches are mirrored in the git repo.

Sam: Git replace exists, which may allow you to keep the existing history

Jonathan: I’m not sure if git replace is sufficient, I’ll double check. One problem is that we already have two sets of hashes because of the http / https clones.

Don: is the WebKit org going to be used for anything other than the WebKit repo?

Jonathan: I’m not sure

Jonathan: Tess, is the WebKit org going to be used for anything else (such as the feature status you discussed yesterday?)

Tess: Yes, that would make sense

Don: Are we going to use any existing GitHub integrations/apps (there is one to delete pull request branches)? Are there any requirements for hosting our own tools around GitHub?

Jonathan: There is interest, but we haven’t made any plans as many of them don’t come into play until we use pull requests and GitHub issues. We also need to get clarification from GitHub about API request limits that we need to consider when building our own automation. But I am absolutely interested in not rebuilding things that already exist.

Don: Ok cool, just wanted to make sure it was on the table.

Last modified on Jan 5, 2021 3:10:50 PM