Async machine translations#4043

Draft
jonbulz wants to merge 3 commits into develop from
enhancement/async-bulk-translations

Conversation

@jonbulz
Contributor

@jonbulz jonbulz commented Nov 27, 2025

Short description

Add an option for asynchronous machine translations (so far only for DeepL).

Proposed changes

  • Add a translate_async boolean flag to translate_queryset in MachineTranslationApiClient
  • Move the translation request to a Celery task if translate_async=True

Side effects

  • None

Faithfulness to issue description and design

There are no intended deviations from the issue and design.

How to test

Resolved issues

Fixes: #3852


Pull Request Review Guidelines

@jonbulz
Contributor Author

jonbulz commented Dec 1, 2025

@PeterNerlich @hannaseithe I have played around a bit to find a way to move the DeepL API calls to Celery, and this is the work in progress. I'm not really happy with it yet, though. The main problem is that everything passed to Celery needs to be JSON-serializable, while our ApiClient is designed to hold a lot of rich objects as attributes.
I'll probably suspend this for a bit to work on my issues that are actually on the roadmap. But if you're bored, maybe you could have a look and tell me if you come up with a better idea?
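
The serialization constraint can be illustrated with the standard library alone. This is a minimal sketch, not the project's code: `FakeApiClient`, `task_payload` and `is_json_serializable` are hypothetical names, and the payload fields are assumptions about what a task might need.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FakeApiClient:
    """Hypothetical stand-in for a client that holds rich objects."""

    region: object  # e.g. a model instance in the real client
    translatable: list = field(default_factory=list)


def is_json_serializable(value) -> bool:
    """Return True if json.dumps accepts the value as a task argument."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True


def task_payload(region_id: int, object_ids: list, target_language_slug: str) -> str:
    """Build a payload of plain IDs and strings, which a JSON serializer accepts."""
    return json.dumps(
        {
            "region_id": region_id,
            "object_ids": object_ids,
            "target_language_slug": target_language_slug,
        }
    )
```

The rich client fails the check, while a payload of primary keys passes; the worker would then have to re-fetch the objects from the database by ID.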

@PeterNerlich
Contributor

The main problem is that everything that is passed to celery needs to be JSON-serializable, while our ApiClient is designed to have a lot of rich objects as attributes

For linkcheck I recently worked on a wrapper that points to an arbitrary object by specifying the Django app that defines the model, the model name, and the ID. Wouldn't this suffice here as well?
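
A rough sketch of what such a reference wrapper could look like, using only the standard library. `ObjectRef` and its methods are hypothetical and not the actual linkcheck wrapper; the `fetch` callable is an assumption that keeps the sketch independent of Django.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ObjectRef:
    """Hypothetical JSON-friendly pointer to a model instance."""

    app_label: str
    model_name: str
    pk: int

    def to_json(self) -> str:
        """Serialize the reference so it can be passed as a task argument."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "ObjectRef":
        """Rebuild the reference inside the worker."""
        return cls(**json.loads(raw))

    def resolve(self, fetch: Callable[[str, str, int], Any]) -> Any:
        """Look up the object via a caller-supplied fetch function.

        In a Django project, fetch could be something like
        lambda a, m, pk: apps.get_model(a, m).objects.get(pk=pk)
        with apps imported from django.apps.
        """
        return fetch(self.app_label, self.model_name, self.pk)
```

Only the three plain fields cross the task boundary; the lookup happens on the worker side.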

@dkehne
Collaborator

dkehne commented Jan 19, 2026

Just some AI-assisted thoughts:

  • The async task probably can't use messages.error() since there's no request, so users won't know if translations failed.
  • Do I understand correctly that a Celery task dispatched from within a Django view runs in a separate worker process? If so, a task dispatched before the transaction commits might try to fetch objects that aren't visible yet. Peter's linkcheck work wraps the task dispatch in transaction.on_commit(), which seems like a good idea.
  • Budget tracking doesn't work: the region.mt_budget_used update in translate_queryset won't account for async translations since it runs before they complete.
  • Code duplication: the translation logic is now duplicated between the sync and async paths. We could consider extracting translate_attr and the save logic into shared functions (which @jonbulz has already partially done).
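
The ordering concern behind the transaction.on_commit() point can be shown with a toy simulation. This is not Django code: `FakeTransaction` and `run_view` are hypothetical and only mimic the behaviour that callbacks run after a successful commit and are dropped on rollback.

```python
class FakeTransaction:
    """Toy stand-in for a database transaction with on-commit callbacks."""

    def __init__(self):
        self._callbacks = []

    def on_commit(self, func):
        """Register a callback to run only after a successful commit."""
        self._callbacks.append(func)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            # Commit succeeded: now it is safe to dispatch tasks.
            for func in self._callbacks:
                func()
        # On an exception the callbacks are dropped and the error propagates.
        return False


def run_view(fail: bool = False) -> list:
    """Simulate a view that writes objects and schedules a task dispatch."""
    events = []
    try:
        with FakeTransaction() as tx:
            events.append("save objects")
            tx.on_commit(lambda: events.append("dispatch task"))
            if fail:
                raise RuntimeError("rollback")
            events.append("more writes")
    except RuntimeError:
        events.append("rolled back")
    return events
```

In the successful case the dispatch happens last, after all writes are committed; in the failing case the task is never dispatched.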

And back to your initial question, @jonbulz: one alternative could be creating a TranslationJob model:

  from django.contrib.contenttypes.models import ContentType
  from django.db import models

  # Region and Language are the project's existing models (imports omitted)

  class TranslationJob(models.Model):
      # choices must be (value, label) pairs, and CharField requires max_length
      STATUS_CHOICES = [(s, s.capitalize()) for s in ('pending', 'queued', 'running', 'completed', 'failed')]

      region = models.ForeignKey(Region, on_delete=models.CASCADE)
      source_language = models.ForeignKey(Language, on_delete=models.CASCADE, related_name='+')
      target_language = models.ForeignKey(Language, on_delete=models.CASCADE, related_name='+')
      content_type = models.ForeignKey(ContentType, on_delete=models.CASCADE)
      object_ids = models.JSONField()  # List of IDs to translate

      status = models.CharField(max_length=16, choices=STATUS_CHOICES)
      created_at = models.DateTimeField(auto_now_add=True)
      completed_at = models.DateTimeField(null=True)
      error_message = models.TextField(blank=True)

Then the Celery task just receives job_id:

  from celery import shared_task

  @shared_task
  def translate_async(job_id: int):
      job = TranslationJob.objects.get(id=job_id)
      # Everything needed is in the job object
      Model = job.content_type.model_class()
      queryset = Model.objects.filter(pk__in=job.object_ids)
      # ... do translation, update job.status when done

Benefits:

  • Solves serialization (just pass one integer)
  • Provides job status tracking for free
  • Users can see progress/errors in the UI
  • Can retry failed jobs
  • Can show translation history
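
How status tracking and retries might fit together can be sketched without Django or Celery. Everything here is an assumption for illustration: `JobState`, `ALLOWED` and `run_job` are hypothetical names, and the transition table is just one plausible reading of the five statuses above.

```python
from dataclasses import dataclass

# Hypothetical transition table: which status may follow which.
ALLOWED = {
    "pending": {"queued"},
    "queued": {"running"},
    "running": {"completed", "failed"},
    "failed": {"queued"},  # a failed job may be retried
    "completed": set(),
}


@dataclass
class JobState:
    """In-memory stand-in for the status fields of a job record."""

    status: str = "pending"
    error_message: str = ""

    def transition(self, new_status: str, error_message: str = "") -> None:
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"cannot go from {self.status} to {new_status}")
        self.status = new_status
        self.error_message = error_message


def run_job(job: JobState, work) -> JobState:
    """Queue and run the job, recording success or the error text."""
    job.transition("queued")
    job.transition("running")
    try:
        work()
    except Exception as exc:
        job.transition("failed", str(exc))
    else:
        job.transition("completed")
    return job
```

Because the error text is stored on the job, a UI could show it to the user even though the task has no request object, and calling `run_job` again on a failed job acts as a retry.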


Development

Successfully merging this pull request may close these issues.

Bulk Action translation fails / times out with large amount of pages/words