
Add option to add a data disk to an instance #759


Closed · wants to merge 1 commit

Conversation

afbjorklund
Member

@afbjorklund afbjorklund commented Mar 26, 2022

The data disk will show up after the root disk

Typically as /dev/vdb, when running with virtio

For #722

User needs to partition, format and mount the disk...


# Disk size
# 🟢 Builtin default: "100GiB"
disk: null

# Data size
# 🟢 Builtin default: "0"
data: "10GiB"
Disk /dev/vda: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/vdb: 10 GiB, 10737418240 bytes, 20971520 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
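The partition/format/mount step mentioned in the description could, for instance, be handled via cloud-init user-data. A hypothetical fragment (not part of this PR; the device name and mount point are assumptions):

```yaml
#cloud-config
# Hypothetical: partition /dev/vdb, format the first partition as ext4,
# and mount it under /mnt/data.
disk_setup:
  /dev/vdb:
    table_type: gpt
    layout: true
    overwrite: false

fs_setup:
  - device: /dev/vdb
    partition: 1
    filesystem: ext4

mounts:
  - [/dev/vdb1, /mnt/data, ext4, "defaults,nofail", "0", "2"]
```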

The data disk will show up after the base disk

Typically as /dev/vdb, when running with virtio

Signed-off-by: Anders F Björklund <[email protected]>
@jandubois
Member

I think we would want to exclude the "data disk" from the list of files deleted by the factory-reset command.

I have not tested this with Alpine yet; I want to make sure that the "implicit" data volume continues to be the "diff disk" and not the additional "data disk".

@jandubois jandubois self-requested a review March 26, 2022 19:06
@@ -39,6 +39,10 @@ memory: null
# 🟢 Builtin default: "100GiB"
disk: null

# Data size
Member


I think this needs additional documentation to explain that this is a supplemental data volume, and the parameter is the size, and can use size-related units like "GiB". Maybe the builtin default should change to "0GiB" to make this obvious?
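As a rough illustration of the size-unit handling being discussed (the actual Lima implementation is in Go and is not shown here; this sketch and its defaulting of bare numbers to bytes are assumptions), a parser for values like "10GiB" might look like:

```python
import re

# Binary unit multipliers; a bare number is taken as bytes
# (an assumption, not necessarily what Lima does).
UNITS = {"": 1, "B": 1, "KiB": 1024, "MiB": 1024**2,
         "GiB": 1024**3, "TiB": 1024**4}

def parse_size(value: str) -> int:
    """Parse a human-readable size such as '10GiB' into bytes."""
    m = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([A-Za-z]*)", value.strip())
    if not m or m.group(2) not in UNITS:
        raise ValueError(f"invalid size: {value!r}")
    return int(float(m.group(1)) * UNITS[m.group(2)])

print(parse_size("10GiB"))  # 10737418240, matching the fdisk output above
print(parse_size("0"))      # 0
```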

Member


The data disk also needs to be mentioned in docs/internal.md.

@jandubois
Member

I'm not sure if there would ever be a reason to create more than one additional data volume? In which case data should be a list and not a scalar.

I can't think of anything right now, but don't want us to paint ourselves into a corner that will require incompatible schema changes in the future.

@jandubois
Member

I think we would want to exclude the "data disk" from the list of files deleted by the factory-reset command.

sighs

On second thoughts, this is kind of weird, as "factory-reset" implies that all data is erased. On the other hand, the main reason to have a separate data volume is to be able to persist data beyond the lifetime of the instance itself.

Maybe factory-reset should display a prompt when a data volume exists? And then have a --keep-data-volume option that can be set to true/false to avoid the prompt? Just brainstorming right now.
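A hypothetical session for the behavior being brainstormed (neither the prompt nor the flag exists at this point, and the actual CLI may differ):

```console
$ limactl factory-reset default
? Instance "default" has a data volume; keep it? (y/N)
$ limactl factory-reset --keep-data-volume=false default   # skips the prompt
```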

@afbjorklund
Member Author

afbjorklund commented Mar 27, 2022

I have not tested this with Alpine yet; I want to make sure that the "implicit" data volume continues to be the "diff disk" and not the additional "data disk".

The persistent disk for distributions that boot from a LiveCD was not considered a "data" disk; it was regarded more as part of the root (or "OS") disk, using the cloud parlance*. Even though it mounts under /mnt/data and is called data-volume...

lima-alpine:/home/anders/lima$ sudo fdisk -l
Disk /dev/vda: 100 GB, 107374182400 bytes, 209715200 sectors
366634 cylinders, 13 heads, 44 sectors/track
Units: sectors of 1 * 512 = 512 bytes

Device  Boot StartCHS    EndCHS        StartLBA     EndLBA    Sectors  Size Id Type
/dev/vda1    2,0,33      178,12,44         2048  209715199  209713152 99.9G 83 Linux
Disk /dev/vdb: 10 GB, 10737418240 bytes, 20971520 sectors
20805 cylinders, 16 heads, 63 sectors/track
Units: sectors of 1 * 512 = 512 bytes

Disk /dev/vdb doesn't contain a valid partition table

But you are right, the naming could be confusing when compared with 05-persistent-data-volume.sh and friends. It was normally just called the "persistence partition" and mounted under /mnt/sda1 or somesuch back when I was using machine.

And it could well be more than one.


* https://docs.microsoft.com/en-us/azure/virtual-machines/managed-disks-overview#disk-roles

  • OS disk
  • Temporary disk
  • Data disk(s)

@afbjorklund afbjorklund marked this pull request as draft March 27, 2022 07:16
@@ -39,6 +39,10 @@ memory: null
# 🟢 Builtin default: "100GiB"
disk: null

# Data size
# 🟢 Builtin default: "0"
data: null
Member


"data" isn't descriptive because everything is data 😛

maybe additionalDisks (slice) ?

Member Author

@afbjorklund afbjorklund Mar 28, 2022


Well, "disk" is also kind of vague. And "data" was ambiguous.

Using "extra" or "additional" could work, perhaps.

@AkihiroSuda
Member

Closes #722

#722 probably requires supporting a custom disk image path

@AkihiroSuda AkihiroSuda added the enhancement New feature or request label Mar 28, 2022
@chancez
Contributor

chancez commented Mar 28, 2022

Here are my thoughts:

  • I agree data is a bit vague. I think disks is better, or devices; using an object, you could define the type of device and other options (maybe you expose all the -drive options directly). Eg: devices: [{type: virtio, path: "/path/to/somewhere"}].
  • It would be good to support more than 1 disk. So instead of disk: being a single value, perhaps disks could become a list of objects. Eg: data: [ {size: 8GiB }]
  • In the disk's configuration you could configure if it should be wiped on a factory reset. Eg: {size: 8GiB, deletionPolicy: "delete|keep"}

Altogether:

devices:
- type: virtio # maybe should be qcow2?
  size: 8GiB
- size: 100GiB
  deletionPolicy: keep # don't delete on factory reset
  name: "home" # maybe use the name as the filename in the Lima VM directory if path isn't specified?
  path: /volumes/usbstick/vm-home.qcow2
- size: 15GiB # simple disk, only size is actually required.
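For illustration only (Lima itself is written in Go, and none of this is implemented), the defaulting and validation rules implied by this proposal could be sketched as:

```python
# Hypothetical validation for the proposed per-disk schema above.
# Field names (size, deletionPolicy, name, path) come from the proposal;
# the defaults chosen here are assumptions, not Lima behavior.
def normalize_device(dev: dict) -> dict:
    if "size" not in dev:
        raise ValueError("size is the only required field")
    out = {
        "size": dev["size"],
        "deletionPolicy": dev.get("deletionPolicy", "delete"),
        "name": dev.get("name"),
        "path": dev.get("path"),
    }
    if out["deletionPolicy"] not in ("delete", "keep"):
        raise ValueError(f"invalid deletionPolicy: {out['deletionPolicy']!r}")
    return out

print(normalize_device({"size": "15GiB"})["deletionPolicy"])  # delete
```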

@jandubois
Member

  • In the disk's configuration you could configure if it should be wiped on a factory reset. Eg: {size: 8GiB, deletionPolicy: "delete|keep"}

As I wrote in #722 (comment), I think it will be better if we manage data volumes independent of instances. Because I may want to keep the data even when I delete the instance, and not just factory reset it. And I may want to read the data from a different instance than the one that created it.

So I think limectl volume create|ls|destroy make more sense, and then you can attach the disk to your instance via lima.yaml. And we can eventually extend the volume functionality by supporting things like resizing and cloning volumes.
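A hypothetical transcript of the proposed lifecycle (none of these subcommands exist; names and output format are invented here for illustration):

```console
$ limectl volume create --size 10GiB mydata
$ limectl volume ls
NAME     SIZE     ATTACHED-TO
mydata   10GiB    -
$ limectl volume destroy mydata
```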

@afbjorklund
Member Author

Okay, I guess we can consider the proof-of-concept concluded then (it can do an array of disks, but it seems more is wanted)

Labels: enhancement (New feature or request)
4 participants