Add expand_archive() and expand_archive_streaming() for extracting
ZIP/TAR archives directly into MFS with auto-detection and conflict handling.
- add _archive.py with ArchiveAdapter base class, ZipAdapter, and TarAdapter
- expand_archive() provides atomic extraction via import_tree()
- expand_archive_streaming() provides low-memory streaming extraction
- add _sanitize_archive_path() for Zip Slip prevention
- add on_conflict parameter for duplicate and collision handling
- re-export ArchiveAdapter, expand_archive, expand_archive_streaming from __init__
- add 67 tests across 5 new test files (total: 436)
- update README, README_ja, TESTING, TESTING_ja, CHANGELOG, and examples
- add archive-related Non-Goals to README compatibility section
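The Zip Slip prevention mentioned above can be illustrated with a minimal sketch. This is not the commit's `_sanitize_archive_path()` (its source is not shown here); the function name `sanitize_archive_path`, its `dest_root` parameter, and the exact rejection rules are assumptions chosen for illustration only.

```python
import posixpath


def sanitize_archive_path(member_name: str, dest_root: str = "/") -> str:
    """Hypothetical helper: map an archive member name to a safe
    absolute POSIX path under dest_root, or raise ValueError."""
    # Treat Windows-style separators as path separators too.
    name = member_name.replace("\\", "/")
    # Reject absolute paths and drive-letter prefixes such as "C:/x".
    if name.startswith("/") or (len(name) >= 2 and name[1] == ":"):
        raise ValueError(f"absolute member path rejected: {member_name!r}")
    # Drop empty and "." segments; refuse any ".." segment outright.
    parts = [p for p in name.split("/") if p not in ("", ".")]
    if not parts or ".." in parts:
        raise ValueError(f"unsafe member path rejected: {member_name!r}")
    return posixpath.join(dest_root, *parts)
```

Rejecting `..` outright, rather than resolving it, is the more conservative choice: a member such as `a/../../etc/passwd` never reaches the point where it could escape the destination root.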
-| 🧪 **Robustness**|369 tests with 97% code coverage |
+| 🧪 **Robustness**|436 tests with 97% code coverage |
| 🔒 **Verified Safety**| 98, 100×4 — top scores across all security categories (Socket.dev) |
| 🌟 **Community**|[Discussed on `r/Python`](https://www.reddit.com/r/Python/comments/1rrqr8z/i_built_an_inmemory_virtual_filesystem_for_python/) with highly positive reception |
- Async wrapper (`AsyncMemoryFileSystem`) powered by `asyncio.to_thread`
- Zero runtime dependencies (standard library only)
- **No admin/root privileges required**: works on locked-down CI runners, containers, and shared machines where OS-level RAM disks are not an option
-- **369 tests, 97% coverage** across 3 OS (Linux / Windows / macOS) × 3 Python versions (3.11–3.13, including free-threaded 3.13t)
+- **436 tests, 97% coverage** across 3 OS (Linux / Windows / macOS) × 3 Python versions (3.11–3.13, including free-threaded 3.13t)
This is useful when `io.BytesIO` is too primitive (single buffer), and OS-level RAM disks/tmpfs are impractical (permissions, container policy, Windows driver friction). Ideal for **CI pipeline acceleration** — eliminate disk I/O from test suites and data processing without any infrastructure changes.
**Note on Architectural Boundary:** This is strictly an in-process tool. External subprocesses (CLI tools) cannot access these files via standard OS paths. If your pipeline relies heavily on passing files to external binaries, an OS-level RAM disk (`tmpfs`) is the correct tool. D-MemFS shines when accelerating Python-native test suites or internal data pipelines.
---
-### Archive Extraction In-Memory
-Extract large ZIP or TAR archives entirely in-memory to process their contents on the fly. Prevent disk wear (TBW) and eliminate the risk of leaving garbage files behind.
+### Archive Extraction
+Extract ZIP/TAR archives directly into D-MemFS using the built-in `expand_archive()` (atomic, all-or-nothing) or `expand_archive_streaming()` (low-memory, incremental). Custom archive formats are supported via the pluggable `ArchiveAdapter` interface. A low-level manual extraction example using `open()`/`write()` is also included as a reference for advanced use cases.
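The difference between atomic and incremental extraction can be sketched with the standard library alone. The code below does not call D-MemFS: a plain `dict` stands in for the in-memory filesystem, and `extract_atomic` / `extract_streaming` are hypothetical names, not the signatures of `expand_archive()` or `expand_archive_streaming()`. Path sanitization is deliberately left out to keep the sketch short.

```python
import io
import zipfile


def extract_atomic(archive: bytes, store: dict[str, bytes]) -> None:
    """All-or-nothing: stage every entry first, commit only on success."""
    staged: dict[str, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # zf.read() verifies the CRC and raises BadZipFile on corruption.
            staged["/" + info.filename] = zf.read(info)
    # Reached only if every entry was read successfully.
    store.update(staged)


def extract_streaming(archive: bytes, store: dict[str, bytes],
                      chunk_size: int = 64 * 1024) -> None:
    """Incremental: each entry is committed as soon as it has been read."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            buf = bytearray()
            with zf.open(info) as src:
                # A real store would append each chunk to an open file
                # handle instead of accumulating it in a buffer.
                while chunk := src.read(chunk_size):
                    buf += chunk
            store["/" + info.filename] = bytes(buf)
```

If the second entry of an archive is corrupt, `extract_atomic` leaves `store` untouched, while `extract_streaming` leaves the first entry in place. That is the trade-off the two modes make: staging costs memory for the whole archive, streaming costs the all-or-nothing guarantee.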
- No symlink/hardlink support — intentionally omitted to eliminate path traversal loops and structural complexity (same rationale as `pathlib.PurePath`).
- No direct `pathlib.Path` / `os.PathLike` API — MFS paths are virtual and must not be confused with host filesystem paths. Accepting `os.PathLike` would allow third-party libraries or a plain `open()` call to silently treat an MFS virtual path as a real OS path, potentially issuing unintended syscalls against the host filesystem. All paths must be plain `str` with POSIX-style absolute notation (e.g. `"/data/file.txt"`).
- No kernel filesystem integration (intentionally in-process only)
+- No exhaustive archive format support: the core handles ZIP and TAR (standard library) only. For other formats (7z, RAR, etc.), you can write your own adapter. See [`examples/archive_extraction.md`](examples/archive_extraction.md) for details.
+- No password-protected / encrypted archive support
+- Archive extraction functions are sync-only. Use `asyncio.to_thread()` in async code.
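The `asyncio.to_thread()` pattern for a sync-only extraction routine might look like the sketch below. The blocking function `extract_tar_sync` is a hypothetical stand-in built on the standard `tarfile` module and returns a plain `dict`; it is not the library's extraction API, whose signature is not shown in this diff.

```python
import asyncio
import io
import tarfile


def extract_tar_sync(archive: bytes) -> dict[str, bytes]:
    """Hypothetical blocking routine standing in for a sync-only API."""
    out: dict[str, bytes] = {}
    # mode="r:*" lets tarfile detect the compression transparently.
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:*") as tf:
        for member in tf:
            if member.isfile():
                # extractfile() returns a file object for regular files.
                out["/" + member.name] = tf.extractfile(member).read()
    return out


async def extract_tar_async(archive: bytes) -> dict[str, bytes]:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(extract_tar_sync, archive)
```

From a coroutine this is awaited as `files = await extract_tar_async(data)`; other tasks on the event loop keep running while the worker thread extracts.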