Skip to content

Only migrate mirrors containing commits in migration v276 #27132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

lunny
Copy link
Member

@lunny lunny commented Sep 19, 2023

Add more conditions to avoid possible migration broken.

@lunny lunny added type/bug skip-changelog This PR is irrelevant for the (next) changelog, for example bug fixes for unreleased features. labels Sep 19, 2023
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Sep 19, 2023
@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Sep 19, 2023
@silverwind silverwind requested a review from KN4CK3R September 19, 2023 11:18
@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Sep 19, 2023
@delvh delvh changed the title Just find those mirrors which not empty and is ready when migrating Only migrate non-empty and ready mirrors in migration v276 Sep 19, 2023
@delvh delvh changed the title Only migrate non-empty and ready mirrors in migration v276 Only migrate mirrors containing commits in migration v276 Sep 19, 2023
@delvh delvh enabled auto-merge (squash) September 19, 2023 15:02
@puni9869 puni9869 added the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Sep 19, 2023
Copy link
Member

@KN4CK3R KN4CK3R left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocking while testing

@GiteaBot GiteaBot added lgtm/blocked A maintainer has reservations with the PR and thus it cannot be merged and removed lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. labels Sep 19, 2023
@silverwind
Copy link
Member

silverwind commented Sep 19, 2023

@puni9869 don't put such premature reviewed/wait-merge please. Unless trivial, each PR should wait at least 24h before merge. The original PR that introduced the broken migration was also one of these cases. It was merged too fast, my review also was a "rubber stamp" kind of review there.

Copy link
Member

@KN4CK3R KN4CK3R left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR does not fix a problem. The real error is, that the migration revealed that the database is/was inconsistent. See this discord discussion: https://discord.com/channels/322538954119184384/322910365237248000/1153443258014388254

@@ -126,6 +129,8 @@ func migratePushMirrors(x *xorm.Engine) error {
var mirrors []PushMirror
if err := sess.Select("push_mirror.id, push_mirror.repo_id, push_mirror.remote_name, push_mirror.remote_address, repository.owner_name as repo_owner, repository.name as repo_name").
Join("INNER", "repository", "repository.id = push_mirror.repo_id").
Where("repository.status = ?", repo_model.RepositoryReady).
And("repository.is_empty = ?", false).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong. An empty repository can contain push mirrors. This would not update the RemoteAddress then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An empty repository should have no any git data. So this will avoid read git data here failure. Maybe we can update the remote_address when the first mirror successfully?

@@ -63,6 +64,8 @@ func migratePullMirrors(x *xorm.Engine) error {
var mirrors []Mirror
if err := sess.Select("mirror.id, mirror.repo_id, mirror.remote_address, repository.owner_name as repo_owner, repository.name as repo_name").
Join("INNER", "repository", "repository.id = mirror.repo_id").
Where("repository.status = ?", repo_model.RepositoryReady).
And("repository.is_empty = ?", false).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong. If you migrate an empty repository, you still have a pull mirror in the database. This would not update the RemoteAddress then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my above comment, I think they should be updated when the first mirroring.

@@ -63,6 +64,8 @@ func migratePullMirrors(x *xorm.Engine) error {
var mirrors []Mirror
if err := sess.Select("mirror.id, mirror.repo_id, mirror.remote_address, repository.owner_name as repo_owner, repository.name as repo_name").
Join("INNER", "repository", "repository.id = mirror.repo_id").
Where("repository.status = ?", repo_model.RepositoryReady).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repositories with status = RepositoryPendingTransfer could have mirrors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for migrations, it may/may not have the mirrors. So we cannot assume it always have when the status = RepositoryPendingTransfer. We can only assume when status = RepositoryPendingReady, the git data have the remote.

@@ -126,6 +129,8 @@ func migratePushMirrors(x *xorm.Engine) error {
var mirrors []PushMirror
if err := sess.Select("push_mirror.id, push_mirror.repo_id, push_mirror.remote_name, push_mirror.remote_address, repository.owner_name as repo_owner, repository.name as repo_name").
Join("INNER", "repository", "repository.id = push_mirror.repo_id").
Where("repository.status = ?", repo_model.RepositoryReady).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repositories with status = RepositoryPendingTransfer could have mirrors.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Sep 20, 2023

Share my option again here: I do not think adding RemoteAddress is the right approach, especially it is a patch for the audit

Why it is not ideal? The definition of RemoteAddress is unclear. In git config, the URL could have username/password, but when storing it into database, it loses the credentials. Then RemoteAddress is not really useful for daily usage.

Then, does it really help the audit? I do not think so. One key discussion for the audit is: how to record necessary fields correctly. The root problem is that design of the audit system is not flexible, it doesn't have the ability to record changes which are not in model/database. Adding such "RemoteAddress" only bypasses the audit problem for the mirrors, but it doesn't resolve the problem fundamentally, I can guess and imagine there are/will be still many cases that the audit system doesn't support, it's not practicable to keep adding "FooAddress" / "BarAddress" to various tables in the future.

I would suggest to make the audit system flexible, and revert the "RemoteAddress" related changes until there is a clear design and usage.

@lunny lunny removed the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Sep 20, 2023
@KN4CK3R
Copy link
Member

KN4CK3R commented Sep 20, 2023

Why it is not ideal? The definition of RemoteAddress is unclear.

It's not unclear, we already present that informationen in the API.

Then RemoteAddress is not really useful for daily usage.

Without that field you have no performant way to create a "mirror overview" page for the admin dashboard. Could list constant failures and so on. While the source was the audit PR, it's usage is not limited to it. As shown in the changes, the UI uses it.

it's not practicable to keep adding "FooAddress" / "BarAddress" to various tables in the future.

You don't have add every info needed to the model. If only the audit system needs an information, we can simply create an audit type "inheriting" from the model.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Sep 20, 2023

Why it is not ideal? The definition of RemoteAddress is unclear.

It's not unclear, we already present that informationen in the API.

Unfortunately, the API design was unclear already. "Create" could take the user&password from RemoteAddress, but the RemoteAddress of "List/Get" doesn't have them. I agree that the API shouldn't expose sensitive data, but I think such naming is not a good design. I consider such "RemoteAddress" as an incomplete patch, because it is not a complete replacement for the "git config", maintaining username&password still requires git config.

Then RemoteAddress is not really useful for daily usage.

Without that field you have no performant way to create a "mirror overview" page for the admin dashboard. Could list constant failures and so on. While the source was the audit PR, it's usage is not limited to it. As shown in the changes, the UI uses it.

I agree, while it is just an incidental benefit brought by this change. With or without "RemoteAddress", there is no much difference at the moment, while I think the key point is how to make it complete (the prev topic) and the audit field design (the next topic).

it's not practicable to keep adding "FooAddress" / "BarAddress" to various tables in the future.

You don't have add every info needed to the model. If only the audit system needs an information, we can simply create an audit type "inheriting" from the model.

Maybe I do not have the complete understanding for the design. If you are sure it is good enough, I have no more question.

@puni9869
Copy link
Member

@puni9869 don't put such premature reviewed/wait-merge please. Unless trivial, each PR should wait at least 24h before merge. The original PR that introduced the broken migration was also one of these cases. It was merged too fast, my review also was a "rubber stamp" kind of review there.

I will keep in mind next time, No worries.

@lunny lunny added kind/bugfix and removed type/bug labels Oct 7, 2023
@delvh delvh added type/bug and removed kind/bugfix labels Oct 7, 2023
@lunny lunny closed this Jan 27, 2024
auto-merge was automatically disabled January 27, 2024 09:09

Pull request was closed

@lunny lunny deleted the lunny/fix_again branch January 27, 2024 09:09
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 5, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/blocked A maintainer has reservations with the PR and thus it cannot be merged skip-changelog This PR is irrelevant for the (next) changelog, for example bug fixes for unreleased features. type/bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants