[ADD] fs_attachment_s3_migration: enable S3 migration #534
base: 16.0
Conversation
Module for migrating existing filestore attachments into the S3 backend, with queue_job orchestration.
| ("res_field", "=", False), | ||
| ("res_field", "!=", False), |
Why do we need such a domain leaf? Can't it simply be omitted?
Good catch! This is a required workaround for Odoo's ir.attachment._search method
which automatically adds ('res_field', '=', False) to any domain that doesn't
already contain 'res_field'. Without this tautology, only attachments where
res_field=False would be found, missing all field-linked attachments.
I'll add an explicit comment explaining this.
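For reference, a minimal sketch of such a comment (the helper name here is hypothetical; the domain itself matches the diff above):

```python
def _s3_migration_res_field_domain(self):
    # ir.attachment._search() silently appends ("res_field", "=", False) to
    # any domain that does not already mention "res_field". This tautological
    # leaf keeps field-linked attachments (res_field set) in the result set.
    return ["|", ("res_field", "=", False), ("res_field", "!=", False)]
```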
if not checksum:
    return None

rf_domain = ["|", ("res_field", "=", False), ("res_field", "!=", False)]
Same question about the domain here. If this is intentional, there should be an explicit comment explaining the reason behind it.
Same reason as above.
This module lets users move existing `ir.attachment` files from the
standard filestore or database into an Amazon S3-backed `fs.storage`, using a
wizard directly on the storage form.
Migrations run in background batches, skipping attachments that are already stored in S3 or that must remain in PostgreSQL.
This makes it possible to run the process repeatedly without creating duplicates.
* Cetmix (cetmix.com)
* Ivan Sokolov
* George Smirnov
Cetmix (cetmix.com)
- Ivan Sokolov
- George Smirnov
This setting is neither documented nor explained, yet it's a very important one.
Good point. I'll add a comment in the XML explaining the channel's purpose.
storage_id = fields.Many2one("fs.storage", required=True)
storage_code = fields.Char(required=True)
batch_size = fields.Integer(default=500)
channel = fields.Char(default="root.s3_migration")
I don't think it's a good idea to hardcode it this way, especially taking into account the fact that you are adding it via a data file.
You should assign the default value using a function that reads it from env.ref("fs_attachment_s3_migration.queue_channel_s3_migration").
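Something like the following could work (a sketch only; it assumes the data-file record keeps that XML ID and that `queue.job.channel` exposes `complete_name`):

```python
def _default_channel(self):
    # Resolve the channel created by this module's data file instead of
    # hardcoding its name; fall back if the record has been removed.
    channel = self.env.ref(
        "fs_attachment_s3_migration.queue_channel_s3_migration",
        raise_if_not_found=False,
    )
    return channel.complete_name if channel else "root.s3_migration"

channel = fields.Char(default=lambda self: self._default_channel())
```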
    default=0,
)

@api.onchange("storage_id")
I would recommend avoiding @onchange and using computed stored fields with readonly=False instead.
This will simplify usage as part of automation, including external calls.
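For illustration, a hypothetical sketch of that approach (it assumes `fs.storage` exposes a `code` field, as the wizard fields above suggest):

```python
# Inside the migration wizard model: replace the @api.onchange handler with a
# stored, editable computed field.
storage_code = fields.Char(
    compute="_compute_storage_code",
    store=True,
    readonly=False,  # still editable in the form, but derived by default
    required=True,
)

@api.depends("storage_id")
def _compute_storage_code(self):
    # Unlike @api.onchange, this also runs for records created or written
    # through the ORM (automation, external API calls), not only in forms.
    for wizard in self:
        wizard.storage_code = wizard.storage_id.code
```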
_logger.info(
    "Completed batch migration: %d/%d attachments to storage %s (%d skipped). "
    "Old files will be cleaned by garbage collection.",
by the garbage collector
    max_batches or "unlimited",
)

while True:
I still have some concerns regarding this approach.
Pros
- Multiple jobs can run in parallel -> faster.
Cons
- Jobs are created instantly from the current DB snapshot, so as soon as the migration is started no further changes are taken into account. E.g. if some attachments were removed, they will still be processed even though this is no longer needed.
- When run in parallel, several batches can access the same physical files (same checksum). If we modify the data in one of those batches, the modification will not be saved in the database until the transaction is committed. So we either need to commit these changes explicitly (questionable) or prepare the batches grouped by checksum, so that attachments with the same checksum are always in the same batch (sketched below).
Conclusion
Personally I would prefer to have those batches enqueued one after another, to avoid potential issues and to always work with an up-to-date database snapshot. However, I may be missing something here, so let's research this part better.
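To make the checksum concern concrete, a hypothetical sketch of checksum-grouped batch preparation (method and variable names are illustrative, not taken from the module):

```python
from collections import defaultdict


def _prepare_checksum_grouped_batches(self, attachments, batch_size):
    # Group attachments by checksum so that all records sharing the same
    # physical file always land in the same batch; parallel jobs then never
    # touch the same file, and no explicit commit is needed between them.
    by_checksum = defaultdict(list)
    for attachment in attachments:
        by_checksum[attachment.checksum].append(attachment.id)

    batches, current = [], []
    for ids in by_checksum.values():
        # Keep a checksum group intact even if it overflows the batch size.
        if current and len(current) + len(ids) > batch_size:
            batches.append(current)
            current = []
        current.extend(ids)
    if current:
        batches.append(current)
    return batches
```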
# Avoid empty writes if source is temporarily unreadable
if attachment.file_size and not file_data:
    resolved = self._s3_resolve_migration_bytes(attachment, storage_code)
This leads to the query being run N times, which significantly degrades performance, especially on databases with a large number of attachments.
I would suggest fetching the data for the entire batch (or even the entire DB 🤔) using read_group and then iterating over it.
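For example, something along these lines (a sketch only; variable names are illustrative):

```python
# One read_group call for the whole batch instead of one query per attachment:
# it returns one row per checksum together with the aggregated record count.
batch_ids = attachments.ids  # attachments: recordset of the current batch
groups = self.env["ir.attachment"].read_group(
    domain=[("id", "in", batch_ids)],
    fields=["checksum"],
    groupby=["checksum"],
)
count_by_checksum = {g["checksum"]: g["checksum_count"] for g in groups}
```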
2611c77 to ec02e7a
Helper module that migrates existing attachments to S3.