MDEV-37949: Implement innodb_log_archive_file_size, innodb_log_archive_path, … #4405
Draft: dr-m wants to merge 1 commit into 11.4 from MDEV-37949 (+2,395 −402)
The InnoDB write-ahead log file (ib_logfile0) is pre-allocated to innodb_log_file_size and written as a ring buffer. This is good for write performance and space management, but unsuitable for arbitrary point-in-time recovery or for facilitating incremental backup.

TODO: Implement multi-file recovery (currently, only two files are supported).

TODO: Implement a test to show that specifying innodb_log_recovery_target makes InnoDB read-only.

TODO: Test and fix the crash-safety and recovery of SET GLOBAL innodb_log_archive. Implement proper recovery when the file header of ib_%016x.log is in the innodb_log_archive=OFF format. Ensure that the start LSN of the ib_logfile0 will be written in a way that is compatible with the already written log_sys.get_sequence_bit().

TODO: Enforce a reasonable maximum innodb_log_file_size for the innodb_log_archive=ON format. 4 GiB, perhaps? Maybe store straight 32-bit file offsets in the checkpoint header.

TODO: Address the FIXME comment in recv_sys_t::find_checkpoint_archived().

innodb_log_archive=ON: A new format where InnoDB will create and preallocate files ib_%016x.log instead of writing a circular file ib_logfile0. The file name includes the log sequence number (LSN) at file offset 12288 (log_t::START_OFFSET). Each file will be pre-allocated to innodb_log_file_size. Once the log fills up, we will create and pre-allocate another log file, to which log records will be written. Upon the completion of the first log checkpoint in a recently created log file, the old log file will be marked read-only, signaling that there will be no further writes to that file, and that the file may safely be moved to long-term storage.

innodb_log_recovery_start: The checkpoint LSN to start recovery from. This will be useful when recovering from an archived log, for example when restoring an incremental backup (applying InnoDB log files that were copied since the previous restore).

innodb_log_recovery_target: The requested LSN to end recovery at. This will be useful when recovering data files that were copied as of a time that is before the end of the available log. When this parameter is set, InnoDB will be read-only.

The status variable innodb_lsn_archived will reflect the LSN since when a complete InnoDB log archive is available. Its initial value will be that of the new parameter innodb_log_archive_start. If that variable is 0 (the default), innodb_lsn_archived will be recovered from the available log files. If innodb_log_archive=OFF, innodb_lsn_archived will be adjusted to the latest checkpoint every time a log checkpoint is executed. If innodb_log_archive=ON, the value should not change.

The new setting SET GLOBAL innodb_log_archive=ON will enable log archiving as soon as the current ib_logfile0 is about to wrap around. SET GLOBAL innodb_log_archive=OFF will immediately rewrite the checkpoint header in the latest ib_%016x.log and rename the file to ib_logfile0.

When innodb_log_archive=ON, the setting SET GLOBAL innodb_log_file_size will affect subsequently created log files, starting when the file that is currently being written runs out of space.

no_checkpoint_prepare.inc: A new file, to prepare for subsequent inclusion of no_checkpoint_end.inc. We will invoke the server to parse the log and to determine the latest checkpoint.

All --suite=encryption tests that use innodb_encrypt_log will be skipped for innodb_log_archive=ON, because enabling or disabling encryption on the log is not possible without temporarily setting innodb_log_archive=OFF and restarting the server. The idea is to add the following arguments to an invocation of mysql-test/mtr:

--mysqld=--loose-innodb-log-archive --skip-test=mariabackup

The mariabackup test suite must be skipped when using the innodb_log_archive=ON format, because mariadb-backup will only support the old ib_logfile0 format (innodb_log_archive=OFF).

log_sys.first_lsn: The start of the current log file, to be consulted in log_t::write_checkpoint() when renaming files.
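The ib_%016x.log naming scheme described above can be illustrated with a minimal sketch. The helper names archive_file_name() and parse_archive_file_name() are hypothetical and are not the functions of this patch (log_t::get_archive_path() and log_t::append_archive_name() are not reproduced here). The sketch also assumes that the 64-bit LSN is printed as 16 lowercase hexadecimal digits.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

using lsn_t = std::uint64_t;

// Hypothetical helper: format the name of the archived log file
// whose first record (at file offset 12288) has the given LSN.
std::string archive_file_name(lsn_t first_lsn)
{
  char buf[32]; // "ib_" + 16 hex digits + ".log" + NUL = 24 bytes
  std::snprintf(buf, sizeof buf, "ib_%016" PRIx64 ".log", first_lsn);
  return buf;
}

// Hypothetical inverse: extract the start LSN from a file name.
// Returns false for anything that is not ib_<16 hex digits>.log,
// such as the circular ib_logfile0.
bool parse_archive_file_name(const std::string &name, lsn_t *first_lsn)
{
  if (name.size() != 3 + 16 + 4 ||
      name.compare(0, 3, "ib_") != 0 ||
      name.compare(19, 4, ".log") != 0)
    return false;
  lsn_t lsn = 0;
  for (std::size_t i = 3; i < 19; i++)
  {
    const char c = name[i];
    unsigned digit;
    if (c >= '0' && c <= '9')
      digit = unsigned(c - '0');
    else if (c >= 'a' && c <= 'f')
      digit = unsigned(c - 'a') + 10;
    else
      return false;
    lsn = lsn << 4 | digit;
  }
  *first_lsn = lsn;
  return true;
}
```

Because the names are zero-padded to a fixed width, sorting the file names lexicographically also sorts them by start LSN, which is convenient when listing an archive directory.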
log_sys.archived_lsn: New field: The value of innodb_lsn_archived.

log_sys.end_lsn: New field: The log_sys.get_lsn() when the latest checkpoint was initiated. That is, the start LSN of a possibly empty sequence of FILE_MODIFY records followed by FILE_CHECKPOINT.

log_sys.resize_target: The value of innodb_log_file_size that will be used for creating the next archive log file once the current file (of log_sys.file_size) fills up.

log_sys.archive: New field: The value of innodb_log_archive.

log_sys.next_checkpoint_no: Widen to uint16_t. There may be up to 12288/4=3072 checkpoints in the header.

log_sys.log: If innodb_log_archive=ON, this file handle will be kept open also in the PMEM code path.

log_sys.resize_log: If innodb_log_archive=ON, we may have two log files open both during normal operation and when parsing the log. This will store the other handle (old or new file).

log_sys.resize_buf: In the memory-mapped code path, this will point to the file resize_log when innodb_log_archive=ON.

log_t::archive_new_write(): Create and allocate a new log file, and write the outstanding data to both the current and the new file.

log_t::archived_mmap_switch_prepare(): Create and memory-map a new log file, and update file_size to resize_target. Remember the file handle of the current log in resize_log, so that write_checkpoint() will be able to make it read-only.

log_t::archived_mmap_switch_complete(): Switch to the buffer that was created in archived_mmap_switch_prepare().

log_t::write_checkpoint(): Allow an old checkpoint to be completed in the old log file even after a new one has been created. If we are writing the first checkpoint in a new log file, we will mark the old log file read-only. We will also update log_sys.first_lsn unless it was already updated in the ARCHIVED_MMAP code path. In that code path, there is the special case where log_sys.resize_buf == nullptr and log_sys.checkpoint_buf points to log_sys.resize_log (the old log file that is about to be made read-only). In this case, log_sys.first_lsn will already point to the start of the current log_sys.log, even though the switch has not been fully completed yet.

log_t::header_rewrite(my_bool): Rewrite the log file header before or after renaming the log file. The recovery of the last ib_%016x.log file must also tolerate the ib_logfile0 format.

log_t::set_archive(my_bool): Implement SET GLOBAL innodb_log_archive. An error will be returned if non-archived SET GLOBAL innodb_log_file_size (log file resizing) is in progress. The current log file will be renamed to either ib_logfile0 or ib_%016x.log, as appropriate.

log_t::archive_set_size(): A new function, to ensure that log_sys.resize_target is set on startup.

log_checkpoint_low(): Do not prevent a checkpoint at the start of a file. We want the first innodb_log_archive=ON file to start with a checkpoint.

log_t::create(lsn_t): Initialize last_checkpoint_lsn. Initialize the log header as specified by log_sys.archive (innodb_log_archive).

log_write_buf(): Add the parameter max_length, the file wrap limit.

mtr_t::finish_writer(): Specialize for innodb_log_archive=ON.

log_t::append_prepare<log_t::ARCHIVED_MMAP>(): Special case.

log_t::get_path(): Get the name of the current log file.

log_t::get_circular_path(size_t): Get the path name of a circular file. Replaces get_log_file_path().

log_t::get_archive_path(lsn_t): Return the name of an archived log file.

log_t::get_next_archive_path(): Return the name of the next archived log.

log_t::append_archive_name(): Append the archive log file name to a path string.

mtr_t::finish_writer(): Invoke log_close() only if innodb_log_archive=OFF. In the innodb_log_archive=ON format, we only force log checkpoints after creating a new archive file, to ensure that the first checkpoint will be written as soon as possible.

log_t::checkpoint_margin(): Replaces log_checkpoint_margin(). If a new archived log file has been created, wait for the first checkpoint in that file.
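The reason for widening log_sys.next_checkpoint_no can be checked with a few lines of arithmetic. This is only a sketch: the constant names below are made up for illustration, and the 4-byte slot size is an assumption read off the "12288/4" in the text, not the actual header layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// 12288 is log_t::START_OFFSET according to the text above.
constexpr std::size_t START_OFFSET = 12288;
// Assumption for illustration: one checkpoint occupies 4 bytes of header.
constexpr std::size_t CHECKPOINT_SLOT_SIZE = 4;
// Upper bound on the number of checkpoints that fit in one file header.
constexpr std::size_t MAX_CHECKPOINTS = START_OFFSET / CHECKPOINT_SLOT_SIZE;

static_assert(MAX_CHECKPOINTS == 3072, "12288/4 checkpoints per header");
static_assert(MAX_CHECKPOINTS > std::numeric_limits<std::uint8_t>::max(),
              "an 8-bit counter could not number every checkpoint");
static_assert(MAX_CHECKPOINTS <= std::numeric_limits<std::uint16_t>::max(),
              "a 16-bit counter is wide enough");
```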
srv_log_rebuild_if_needed(): Never rebuild if innodb_log_archive=ON. The setting innodb_log_file_size will affect the creation of subsequent log files. The parameter innodb_encrypt_log cannot be changed while the log is in the innodb_log_archive=ON format.

log_t::attach(), log_mmap(): Add the parameter bool read_only.

recv_sys_t::find_checkpoint(): If the circular ib_logfile0 is missing, determine the oldest archived log file with contiguous LSN. If innodb_log_archive=ON, refuse to start if ib_logfile0 exists. Open non-last archived log files in read-only mode.

recv_sys_t::find_checkpoint_archived(): Validate each checkpoint in the current file header, and by default aim to recover from the last valid one. Terminate the search if the last validated checkpoint spanned two files.

log_parse_file(): Do not invoke fil_name_process() during recv_sys_t::find_checkpoint_archived(), when we tolerate FILE_MODIFY records while looking for a FILE_CHECKPOINT record.

recv_scan_log(): Invoke log_t::archived_switch_recovery() upon reaching the end of the current archived log file.

log_t::archived_switch_recovery(): Switch files in the pread() code path.

log_t::archived_mmap_switch_recovery_complete(): Switch files in the memory-mapped code path.

recv_warp: A pointer wrapper for memory-mapped parsing that spans two archive log files.

recv_sys_t::parse_mmap(): Use recv_warp for innodb_log_archive=ON.

recv_sys_t::parse(): Tweak some logic for innodb_log_archive=ON.

log_t::set_recovered_checkpoint(): Set the checkpoint on recovery. This also updates end_lsn.

log_t::clear_mmap(): Clean up the logic.

log_t::persist(): Even if the flushed_to_disk_lsn does not change, we may want to reset the write_lsn_offset.
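One way to picture "the oldest archived log file with contiguous LSN" in recv_sys_t::find_checkpoint() is the following sketch. It is not the code of this patch. The struct archived_file and the function oldest_contiguous() are hypothetical, and the sketch assumes that a file of file_size bytes whose name carries first_lsn covers the LSN range from first_lsn up to first_lsn + file_size - 12288, so that the next file is expected to start exactly there.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using lsn_t = std::uint64_t;

constexpr std::uint64_t START_OFFSET = 12288; // log_t::START_OFFSET

// Hypothetical description of one ib_%016x.log file.
struct archived_file
{
  lsn_t first_lsn;         // LSN encoded in the file name
  std::uint64_t file_size; // size of the file in bytes
};

// Assumption: the payload after the header maps 1:1 to LSN values.
inline lsn_t end_lsn(const archived_file &f)
{
  return f.first_lsn + (f.file_size - START_OFFSET);
}

// Sort the files by start LSN and return the index of the oldest file
// from which an unbroken chain leads to the newest file.
// Returns 0 for an empty vector.
std::size_t oldest_contiguous(std::vector<archived_file> &files)
{
  std::sort(files.begin(), files.end(),
            [](const archived_file &a, const archived_file &b)
            { return a.first_lsn < b.first_lsn; });
  if (files.empty())
    return 0;
  std::size_t i = files.size() - 1;
  // Walk backwards while the previous file ends where this one starts.
  while (i > 0 && end_lsn(files[i - 1]) == files[i].first_lsn)
    --i;
  return i;
}
```

Walking backwards from the newest file means that a stray old file separated by a gap is ignored rather than chosen as the starting point. Whether the actual implementation treats gaps this way is not stated in the text.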
Description
TODO: fill description here
Release Notes
TODO: What should the release notes say about this change?
How can this PR be tested?
The mariabackup suite must be skipped when setting innodb_log_archive=ON, because the mariadb-backup tool will only support the old innodb_log_archive=OFF format (copying from ib_logfile0).

Unfortunately, all --suite=encryption tests that use innodb_encrypt_log must be skipped when using innodb_log_archive. This is because the server would have to be reinitialized; we do not allow changing the format of an archived log on startup (such as adding or removing encryption). This combination is covered by the test innodb.log_file_size_online,encrypted.

Basing the PR against the correct MariaDB version

This is a new feature, but for now it is based on the 11.4 branch so that any unrelated errors that may be found during testing can be fixed rather quickly. Merges to the main branch may be blocked for weeks at a time.

PR quality check