services/galexie: add integration tests for S3 storage. #5749


Open · overcat wants to merge 7 commits into master from feat-s3-support-dev
Conversation

@overcat (Contributor) commented Jul 12, 2025

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with the name of the package most changed in the PR, e.g.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc.) affected by this change. Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've reviewed the changes in this PR, and if I consider them worth mentioning in the release notes, I have updated the relevant CHANGELOG.md within the component folder structure. For example, if I changed horizon, then I updated services/horizon/CHANGELOG.md, adding a new line item describing the change with a reference to this PR. If I don't update a CHANGELOG, I acknowledge this PR's change may not be mentioned in future release notes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

This is the follow-up PR for #5748. Let's merge #5748 first, and then we'll handle this PR.

Why

See #5748

Known limitations

N/A

@overcat overcat force-pushed the feat-s3-support-dev branch from eb86f15 to 0bc05d1 Compare July 13, 2025 23:34
@overcat overcat marked this pull request as ready for review July 13, 2025 23:34
@urvisavla (Contributor) commented:

@overcat we've been a bit busy with the Protocol 23 release but we'll review this soon. Thanks!

@overcat (Contributor, Author) commented Jul 16, 2025

> @overcat we've been a bit busy with the Protocol 23 release but we'll review this soon. Thanks!

It's okay, this PR is not urgent.

@urvisavla urvisavla requested review from tamirms and sreuland July 17, 2025 19:23
@urvisavla (Contributor) left a review:
Great work integrating the S3 tests into the existing framework. The PR looks good.

```yaml
- name: Pull LocalStack image (for S3)
  if: ${{ matrix.storage_type == 'S3' }}
  shell: bash
  run: docker pull localstack/localstack:latest
```
A reviewer (Contributor) commented:

I think it would be better to use a fixed version. The `latest` tag is mutable, and it's possible that some future update could break our tests.
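A pinned variant of that step might look like the following (the tag shown here is illustrative only; the actual version chosen was pinned in e8aa374):

```yaml
- name: Pull LocalStack image (for S3)
  if: ${{ matrix.storage_type == 'S3' }}
  shell: bash
  # Pin to an exact release so upstream image changes can't break the test matrix.
  run: docker pull localstack/localstack:3.4.0
```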

@overcat (Contributor, Author) replied:

Good idea, I fixed the tag. e8aa374

@tamirms (Contributor) commented Jul 18, 2025

@overcat this looks great! Could you add one more test case to the integration tests? I don't think we have any coverage exercising the code path where we receive a precondition failed error upon trying to insert a file that already exists. I think the following test case should exercise that path:

  1. initialize empty bucket
  2. scan-and-fill --start 4 --end 10
  3. scan-and-fill --start 8 --end 15

In step (3), galexie should try to insert ledger files 8, 9, and 10. But it should receive a precondition failed error from AWS and handle that error by skipping to the next file.

@overcat (Contributor, Author) commented Jul 19, 2025

> @overcat this looks great! could you add one more test case to the integration tests? I don't think we have any coverage which exercises the code path where we receive a precondition failed error upon trying to insert a file that already exists. I think the following test case should exercise that path:
>
>   1. initialize empty bucket
>   2. scan-and-fill --start 4 --end 10
>   3. scan-and-fill --start 8 --end 15
>
> In step (3) galexie should try to insert ledger files 8, 9, and 10. But, it should receive a precondition failed error from AWS and handle that error by skipping over to the next files

Hi @tamirms, I would like to extend the existing TestAppend test to include this scenario. Theoretically, it would be best if we could ensure that the object has not changed, but the current datastore interface does not provide this capability. If necessary, we can add a function to the datastore to return the object's identifier.

@tamirms (Contributor) commented Jul 21, 2025

> I would like the existing TestAppend test to include this scenario.

sure, that sounds good!

> Theoretically, it is best if we can ensure that the object has not changed, but the current datastore interface does not provide this capability. If necessary, we can add a function to the datastore to return the object's identifier.

I'm not sure what the equivalent of that would be for GCS. Alternatively, you could return the last modified timestamp of the object; I believe that is available for both GCS and S3.

@urvisavla (Contributor) commented:

> Theoretically, it is best if we can ensure that the object has not changed, but the current datastore interface does not provide this capability. If necessary, we can add a function to the datastore to return the object's identifier.

By this I'm assuming you mean checking that the object in the datastore is the same as the one we're attempting to upload. We could definitely check for that, but I'm not sure what we'd do with that information beyond just logging it, unless you're suggesting we go ahead and overwrite the object (upload without preconditions) in such a case.

@overcat (Contributor, Author) commented Jul 22, 2025

Hey @tamirms, I think 6804884 might be what you're looking for. Let me know if I'm on the right track so I can get those unit tests for GetFileLastModified added! cc @urvisavla

@tamirms (Contributor) commented Jul 22, 2025

> Let me know if I'm on the right track so I can get those unit tests for GetFileLastModified added!

@overcat that looks good to me!

@overcat (Contributor, Author) commented Jul 22, 2025

Thanks @tamirms, I have now added unit tests. 5e32083

BTW, there are many functions in the datastore that use HeadObject; I think we can refactor them, but let's put that in a separate PR.
