gitea dump does not respect --tempdir option #9100

Closed
2 tasks done
nodiscc opened this issue Nov 20, 2019 · 14 comments

@nodiscc
Contributor

nodiscc commented Nov 20, 2019

  • Gitea version (or commit ref): 1.8.0 (attempting to back up before migration)
  • Git version: 1:2.20.1-2
  • Operating system: Debian GNU/Linux 10 Buster
  • Database (use [x]):
    • MySQL
  • Can you reproduce the bug at https://try.gitea.io:
    • Not relevant
  • Log gist:

Description

When specifying the --tempdir option for gitea dump (https://docs.gitea.io/en-us/backup-and-restore/), the database dump seems to ignore the --tempdir option and instead dumps to /tmp/ (or /tmp/$UID/ when using libpam-tmpdir).

When the /tmp partition is too small for the dump, the backup fails.

$ cat /opt/xsrv/gitea-dump.sh 
#!/bin/bash
# Description: backup/dump script for gitea
set -o errexit
cd /var/backups/gitea/
sudo -u gitea gitea dump --tempdir /var/backups/gitea/ -c /etc/gitea/app.ini

$ /opt/xsrv/gitea-dump.sh
2019/11/20 19:25:37 Creating tmp work dir: /var/backups/gitea/gitea-dump-605371411
2019/11/20 19:25:37 Packing dump files...
2019/11/20 19:25:37 Dumping local repositories.../var/lib/gitea/repos
2019/11/20 19:44:29 Dumping database...
2019/11/20 19:45:20 Failed to save gitea-dump-1574274337.zip: write /tmp/user/998/cae/gitea-dump-1574274337.zip/gitea-repo.zip: no space left on device


$ df -h
...
/dev/mapper/debian--vg-lv--tmp   1.8G  5.7M  1.7G   1% /tmp
/dev/mapper/debian--vg-lv--var    35G   18G   16G  55% /var

I expect that it would dump repositories and the database to the directory specified with --tempdir.

I am on 1.8.0 and attempting to back up before migrating/upgrading to 1.10.0, so if this is fixed in a later release, I'm also interested in any temporary/manual workaround for 1.8.0. (As a last resort I could grow /tmp, but that's not ideal.)

@JacquesOfAllTrades

I'm not a Gitea dev, nor have I tested this, but I'd suggest trying

export TMPDIR=/var/backups/gitea/

before running the dump, then run the dump with the --tempdir option still in place.

Based on this code from https://github.com/go-gitea/gitea/blob/release/v1.8/cmd/dump.go#L81:

	// work-around #1103
	if os.Getenv("TMPDIR") == "" {
		os.Setenv("TMPDIR", tmpWorkDir)
	}

there's presumably something in the dump process that relies on the TMPDIR environment variable. Gitea will set TMPDIR to the directory you provide via --tempdir, but only if it's not already set on your system.
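
To make the "only if it's not already set" part concrete, here is a rough shell equivalent of that Go check. It's only a sketch: effective_tmpdir is a made-up helper name, not anything Gitea ships, and it just reports which directory would end up in TMPDIR for a given --tempdir value.

```shell
# Mirrors the quoted Go logic: the --tempdir value is used for TMPDIR only
# when TMPDIR is unset or empty. effective_tmpdir is a hypothetical helper.
effective_tmpdir() {
    if [ -z "${TMPDIR:-}" ]; then
        printf '%s\n' "$1"        # TMPDIR empty: the --tempdir value wins
    else
        printf '%s\n' "$TMPDIR"   # TMPDIR already set: it is left untouched
    fi
}

# Show which directory would be in effect for the path from the report.
echo "TMPDIR after the work-around: $(effective_tmpdir /var/backups/gitea/)"
```

So if something upstream (login scripts, PAM, sudo) has already put a value in TMPDIR, the --tempdir value never reaches it.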

Could be worth a shot.

@JacquesOfAllTrades

Just a quick follow-up: I can confirm that both Gitea 1.9.1 and 1.9.5 respect the --tempdir option on my system. I'm running a command like

$ sudo -u git gitea dump -c /path/to/app.ini --tempdir /jacques/backups/temp

and the dump contents are correctly staged in /jacques/backups/temp before being assembled into the final .zip file.

Note that the directory passed to --tempdir has to exist and be writable by your git (or gitea) user.

Also, the TMPDIR environment variable on my system is empty, so the code snippet I included in my previous comment will set it to the value provided to the --tempdir argument. If your TMPDIR is already set before running gitea dump, e.g., to /tmp, it won't be modified.

If clearing your TMPDIR var or setting it to the value you're providing to --tempdir works for you, that does seem like a bug, because it would mean the dump process has an undocumented (and undesirable) dependency on that var.
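
For the "has to exist and be writable" requirement, a quick check along these lines might help before running the dump. This is a sketch only: tempdir_usable is a hypothetical helper, and it tests the user who runs it, so it would need to run as the git/gitea user (for example under sudo -u) to be meaningful.

```shell
# Succeeds when the directory exists and the current user can write to it.
# tempdir_usable is a hypothetical helper, not part of Gitea.
tempdir_usable() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Path taken from the original report; adjust to the local layout.
if tempdir_usable /var/backups/gitea/; then
    echo "tempdir exists and is writable"
else
    echo "tempdir is missing or not writable by $(id -un)"
fi
```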

@nodiscc
Contributor Author

nodiscc commented Dec 4, 2019

I have grown the partition, so I can no longer reproduce the disk-full error, but I have changed my dump script to this:

export TMPDIR=/var/backups/gitea/
cd /var/backups/gitea/
sudo -u gitea gitea dump --tempdir /var/backups/gitea/ -c /etc/gitea/app.ini

I ran the script and watched the files opened by the gitea dump process (watch -n 2 sudo lsof -p 19206). It starts backing up repositories and opens /var/backups/gitea/gitea-dump-597340154/gitea-repo.zip and /var/backups/gitea/gitea-dump-1575490742.zip, all fine. At some point it opens

/tmp/user/998/cae/gitea-dump-1575490742.zip
/tmp/user/998/cae/gitea-dump-1575490742.zip/gitea-repo.zip

In the process environment I see

$ sudo cat /proc/19206/environ
....TMP=/tmp/user/998TMPDIR=/tmp/user/998TEMP=/tmp/user/998TEMPDIR=/tmp/user/998

I don't know what sets TMPDIR to this path. Maybe it is because it's run through sudo. Maybe it is related to libpam-tmpdir being enabled. I will do more checks.
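
The environ output above runs together because /proc/<pid>/environ separates entries with NUL bytes. A small filter like the following makes it readable. It is a sketch: tmp_vars is a made-up helper name, and the demo input is synthetic rather than copied from a real process.

```shell
# Print only the temp-directory variables from a NUL-separated environment
# block read on stdin (the format of /proc/<pid>/environ on Linux).
# tmp_vars is a hypothetical helper name.
tmp_vars() {
    tr '\0' '\n' | grep -E '^(TMP|TMPDIR|TEMP|TEMPDIR)=' || true
}

# Demo on a synthetic environment block; real use would look like:
#   sudo cat /proc/19206/environ | tmp_vars
printf 'USER=gitea\0TMP=/tmp/user/998\0TMPDIR=/tmp/user/998\0' | tmp_vars
```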

@JacquesOfAllTrades

> Maybe it is because it's run through sudo.

That's a good thought. You'd probably have to enter a sudo shell as the 'gitea' user before setting TMPDIR and then running gitea dump.

On the other hand, I guess I can't say with absolute certainty that no files are getting put into /tmp at any point during my dump process. I didn't see any files left in /tmp when the dump aborted due to an error (whereas I did see files left in the --tempdir directory), but it's possible that some files got briefly put into /tmp later in the process once my error was fixed.

Of course, all of the rigmarole with the TMPDIR variable was just an attempt to work around your original issue, which I agree looks like a bug in the handling of the --tempdir arg.

@guillep2k
Member

TMPDIR is (or should be) considered a security setting, as many programs could leak information if this env variable is crafted maliciously. As a consequence, sudo will most likely clean it up.

See https://serverfault.com/a/479081/54402
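
If sudo (or PAM) is what replaces TMPDIR, one thing that might be worth trying is setting the variable on the far side of sudo with env, so it is applied after the environment has been prepared. The sketch below only demonstrates the env mechanism with a stand-in child process. The sudo line is left as a comment because it is untested against the Gitea versions discussed here, and it assumes the sudoers policy allows running env. child_tmpdir is a hypothetical helper.

```shell
# $1: TMPDIR value inherited from the parent (stand-in for what sudo/PAM set)
# $2: TMPDIR value forced for the child through env
child_tmpdir() {
    ( export TMPDIR="$1"; env TMPDIR="$2" sh -c 'printf "%s\n" "$TMPDIR"' )
}

# The child sees the value passed through env, not the inherited one.
child_tmpdir /tmp/user/998 /var/backups/gitea

# Applied to the dump from the report (not executed here):
# sudo -u gitea env TMPDIR=/var/backups/gitea gitea dump --tempdir /var/backups/gitea -c /etc/gitea/app.ini
```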

@lunny
Member

lunny commented Dec 5, 2019

@nodiscc could you try a recent version, e.g. v1.10.0 or v1.10.1?

@nodiscc
Contributor Author

nodiscc commented Dec 6, 2019

I will try ASAP

@stale

stale bot commented Feb 6, 2020

This issue has been automatically marked as stale because it has not had recent activity. I am here to help clear issues left open even if solved or waiting for more insight. This issue will be closed if no further activity occurs during the next 2 weeks. If the issue is still valid just add a comment to keep it alive. Thank you for your contributions.

@stale stale bot added the issue/stale label Feb 6, 2020
@nodiscc
Contributor Author

nodiscc commented Feb 8, 2020

I no longer use the gitea dump command as it has some crippling limitations in my opinion:

  • it must be run from the gitea data directory, which is not easily done from a cron job with proper use of sudo (I always ended up getting permission denied messages or similar). I strongly suggest that you make it able to run from any directory, and not rely on PWD to create logs, etc. directories.
  • it zips the archive, which prevents any kind of deduplication by my backup program (rsnapshot)

Instead I simply back up the gitea data directory and database:

# /etc/rsnapshot.d/gitea.conf
# rsnapshot configuration for gitea backups
backup          /var/lib/gitea  localhost/
backup_script   /usr/bin/mysqldump --single-transaction -h localhost -u root gitea > gitea.sql  gitea_db/

@nodiscc nodiscc closed this as completed Feb 8, 2020
@guillep2k
Member

> I no longer use the gitea dump command as it has some crippling limitations in my opinion:
>
> I strongly suggest that you make it able to run from any directory, and not rely on PWD to create logs, etc. directories.

You can always use the --work-dir flag in the command line, specify a WORK_DIR environment variable or set that up in app.ini.

@nodiscc
Contributor Author

nodiscc commented Feb 9, 2020

> You can always use the --work-dir flag in the command line

This is good to know, but it's not documented at https://docs.gitea.io/en-us/backup-and-restore/. Maybe raise another issue?

However the dump command is not really satisfying for me:

  • zipping of the backup requires twice the size of the data directory in disk space on the server (even if I just need to move the zip to another machine, that space is still required during the dump)
  • the dump archive contains another gitea-repo.zip which makes this worse (and also eats disk space when restoring), and is also bad performance-wise (zip in zip)
  • zipping of the backup prevents my backup software from performing file-based deduplication

For these reasons I am switching to the backup procedure I described above (dump the database, back up the db dump and gitea data directory).
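
For anyone who keeps using gitea dump, the disk-space concern in the list above can be checked up front. The sketch below compares the size of the data directory with the free space on the staging filesystem. The helper names (dir_kb, avail_kb, has_room) are hypothetical and the paths are the ones from this thread. Requiring room for one full uncompressed copy is a rough assumption, not a figure from Gitea.

```shell
# Size of a directory tree in KiB (empty output if the path is missing).
dir_kb() { du -sk "$1" 2>/dev/null | awk '{print $1}'; }

# Free KiB on the filesystem holding a path, using POSIX-format df output.
avail_kb() { df -Pk "$1" 2>/dev/null | awk 'NR==2 {print $4}'; }

# Succeeds when the free space at $2 is at least the size of the tree at $1.
has_room() {
    need="$(dir_kb "$1")"
    free="$(avail_kb "$2")"
    [ -n "$need" ] && [ -n "$free" ] && [ "$need" -le "$free" ]
}

# Paths from this thread; adjust to the local layout.
if has_room /var/lib/gitea /var/backups/gitea; then
    echo "staging directory can hold a full copy of the data"
else
    echo "not enough room, or one of the paths does not exist"
fi
```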

@nodiscc
Contributor Author

nodiscc commented Feb 9, 2020

In addition, git .pack files are already compressed, which makes the zip operation a waste of time, disk space and CPU cycles (.packs in .zip in .zip).

@stale

stale bot commented Aug 1, 2020

This issue has been automatically marked as stale because it has not had recent activity. I am here to help clear issues left open even if solved or waiting for more insight. This issue will be closed if no further activity occurs during the next 2 weeks. If the issue is still valid just add a comment to keep it alive. Thank you for your contributions.

@stale stale bot added the issue/stale label Aug 1, 2020
@stale

stale bot commented Aug 16, 2020

This issue has been automatically closed because of inactivity. You can re-open it if needed.

@stale stale bot closed this as completed Aug 16, 2020
@go-gitea go-gitea locked and limited conversation to collaborators Nov 24, 2020
4 participants