Skip to content

Allow PUT updates before guest boot and deny them post boot #263

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 22, 2018
Merged

Allow PUT updates before guest boot and deny them post boot #263

merged 4 commits into from
May 22, 2018

Conversation

alxiord
Copy link

@alxiord alxiord commented May 10, 2018

Changes

  • before the guest boots, PUT requests can update the entities
  • after the guest boots, PUT requests are forbidden.
    • except for /drives, where a subsequent PUT triggers a rescan. This is a temporary solution and will be replaced with something more RESTful
  • refactored vmm::run_api_cmd moving each match branch to a separate handler function
  • added unit tests in the changed files to improve coverage
  • added integration tests for the update scenario

Testing done

Build

cargo fmt --all
cargo build
cargo build --release

Unit tests

sudo env "PATH=$PATH" cargo test --all
sudo env "PATH=$PATH" cargo kcov --all

Coverage increased to 49.6%

Manual tests

# Add boot source
$ curl --unix-socket /tmp/firecracker.socket -i  \
>      -X PUT "http://localhost/boot-source" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>            \"boot_source_id\": \"alinux_kernel\",
>            \"source_type\": \"LocalImage\", 
>            \"local_image\": 
>                 { 
>                     \"kernel_image_path\": \"${kernel_path}\" 
>                 }
>         }"
HTTP/1.1 201 Created
Content-Length: 0
Date: Thu, 10 May 2018 10:16:11 GMT

# Attempt update with invalid path
$ curl --unix-socket /tmp/firecracker.socket -i  \
>      -X PUT "http://localhost/boot-source" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>            \"boot_source_id\": \"alinux_kernel\",
>            \"source_type\": \"LocalImage\", 
>            \"local_image\": 
>                 { 
>                     \"kernel_image_path\": \"foo.bar\" 
>                 }
>         }"
HTTP/1.1 400 Bad Request
Content-Type: application/json
Transfer-Encoding: chunked
Date: Thu, 10 May 2018 10:16:11 GMT

{
  "fault_message": "The kernel path is invalid!"
}


# Update path
$ curl --unix-socket /tmp/firecracker.socket -i  \
>      -X PUT "http://localhost/boot-source" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>            \"boot_source_id\": \"alinux_kernel\",
>            \"source_type\": \"LocalImage\", 
>            \"local_image\": 
>                 { 
>                     \"kernel_image_path\": \"${kernel_path_copy}\" 
>                 }
>         }"
HTTP/1.1 204 No Content
Date: Thu, 10 May 2018 10:16:11 GMT

# Update machine config
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/machine-config" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ \"vcpu_count\": 4, \"mem_size_mib\": 256}"
HTTP/1.1 204 No Content
Date: Thu, 10 May 2018 10:16:11 GMT

# Add root block device
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/drives/read_only_drive" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>             \"drive_id\": \"root\",
>             \"path_on_host\": \"${rootfs_path}\", 
>             \"is_root_device\": true, 
>             \"permissions\": \"rw\", 
>             \"state\": \"Attached\"
>          }"
HTTP/1.1 201 Created
Content-Length: 0
Date: Thu, 10 May 2018 10:16:11 GMT

# Attempt update with invalid path
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/drives/read_only_drive" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>             \"drive_id\": \"root\",
>             \"path_on_host\": \"foo.bar\", 
>             \"is_root_device\": true, 
>             \"permissions\": \"rw\", 
>             \"state\": \"Attached\"
>          }"
HTTP/1.1 400 Bad Request
Content-Type: application/json
Transfer-Encoding: chunked
Date: Thu, 10 May 2018 10:16:11 GMT

{
  "fault_message": "Invalid block device path!"
}

# Update path
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/drives/read_only_drive" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>             \"drive_id\": \"root\",
>             \"path_on_host\": \"${rootfs_path_copy}\", 
>             \"is_root_device\": true, 
>             \"permissions\": \"rw\", 
>             \"state\": \"Attached\"
>          }"
HTTP/1.1 204 No Content
Date: Thu, 10 May 2018 10:16:23 GMT

# Add readonly device
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/drives/read_only_drive" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>             \"drive_id\": \"read_only_drive\",
>             \"path_on_host\": \"${ro_drive_path}\", 
>             \"is_root_device\": false, 
>             \"permissions\": \"rw\", 
>             \"state\": \"Attached\"
>          }"
HTTP/1.1 201 Created
Content-Length: 0
Date: Thu, 10 May 2018 10:16:23 GMT

# Attempt to update readonly device as root device
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/drives/read_only_drive" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>             \"drive_id\": \"read_only_drive\",
>             \"path_on_host\": \"${ro_drive_path}\", 
>             \"is_root_device\": true, 
>             \"permissions\": \"rw\", 
>             \"state\": \"Attached\"
>          }"
HTTP/1.1 400 Bad Request
Content-Type: application/json
Transfer-Encoding: chunked
Date: Thu, 10 May 2018 10:16:23 GMT

{
  "fault_message": "A root block device already exists!"
}

# Create tap interfaces
$ ip tuntap add name vmtap33 mode tap
$ ifconfig vmtap33 192.168.241.1/24 up
$ ip tuntap add name vmtap34 mode tap
$ ifconfig vmtap33 192.169.241.1/24 up

# Add tap interface
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/network-interfaces/1" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>             \"iface_id\": \"1\", 
>             \"host_dev_name\": \"vmtap33\", 
>             \"guest_mac\": \"06:00:00:00:00:01\", 
>             \"state\": \"Attached\" 
>         }"
HTTP/1.1 201 Created
Content-Length: 0
Date: Thu, 10 May 2018 10:16:23 GMT

# Update interface
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/network-interfaces/1" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>             \"iface_id\": \"1\", 
>             \"host_dev_name\": \"vmtap34\", 
>             \"guest_mac\": \"06:00:00:00:00:01\", 
>             \"state\": \"Attached\" 
>         }"
HTTP/1.1 204 No Content
Date: Thu, 10 May 2018 10:16:23 GMT

# Start instance
$ curl --unix-socket /tmp/firecracker.socket -i \
     -X PUT "http://localhost/actions/start" \
     -H  "accept: application/json" \
     -H  "Content-Type: application/json" \
     -d "{  
            \"action_id\": \"start\",  
            \"action_type\": \"InstanceStart\"
         }"
HTTP/1.1 201 Created
Content-Length: 0
Date: Thu, 10 May 2018 10:19:11 GMT

$ curl --unix-socket /tmp/firecracker.socket -i \
     -X GET "http://localhost/actions/start" \
     -H "accept: application/json"
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Date: Thu, 10 May 2018 10:22:34 GMT

{"action_id":"start","action_type":"InstanceStart","timestamp":0}

# Attempt to update boot source
$ curl --unix-socket /tmp/firecracker.socket -i  \
>      -X PUT "http://localhost/boot-source" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>            \"boot_source_id\": \"alinux_kernel\",
>            \"source_type\": \"LocalImage\", 
>            \"local_image\": 
>                 { 
>                     \"kernel_image_path\": \"${kernel_path}\" 
>                 }
>         }"
HTTP/1.1 403 Forbidden
Content-Type: application/json
Transfer-Encoding: chunked
Date: Thu, 10 May 2018 10:23:00 GMT

{
  "fault_message": "The update operation is not allowed"
}

# Attempt to update machine config
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/machine-config" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ \"vcpu_count\": 4, \"mem_size_mib\": 256}"
HTTP/1.1 403 Forbidden
Content-Type: application/json
Transfer-Encoding: chunked
Date: Thu, 10 May 2018 10:23:07 GMT

{
  "fault_message": "The update operation is not allowed"
}

# Attempt to update drive; this triggers a rescan
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/drives/read_only_drive" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>             \"drive_id\": \"read_only_drive\",
>             \"path_on_host\": \"${ro_drive_path}\", 
>             \"is_root_device\": false, 
>             \"permissions\": \"rw\", 
>             \"state\": \"Attached\"
>          }"
HTTP/1.1 204 No Content
Date: Thu, 10 May 2018 10:23:21 GMT

# Attempt to update network interface
$ curl --unix-socket /tmp/firecracker.socket -i \
>      -X PUT "http://localhost/network-interfaces/1" \
>      -H "accept: application/json" \
>      -H "Content-Type: application/json" \
>      -d "{ 
>             \"iface_id\": \"1\", 
>             \"host_dev_name\": \"vmtap33\", 
>             \"guest_mac\": \"06:00:00:00:00:01\", 
>             \"state\": \"Attached\" 
>         }"
HTTP/1.1 403 Forbidden
Content-Type: application/json
Transfer-Encoding: chunked
Date: Thu, 10 May 2018 10:23:30 GMT

{
  "fault_message": "The update operation is not allowed"
}

Bash test (validation that stuff still works)

$ ./test_all.sh $PWD/../../resources/kernels/vmlinux_login_network.bin $PWD/../../resources/amis/ami-fea26484.ext4 $PWD/../../rootfs.ext4 
HTTP/1.1 201 Created
Content-Length: 0
Date: Mon, 07 May 2018 13:12:02 GMT

HTTP/1.1 204 No Content
Date: Mon, 07 May 2018 13:12:02 GMT

HTTP/1.1 201 Created
Content-Length: 0
Date: Mon, 07 May 2018 13:12:02 GMT

HTTP/1.1 201 Created
Content-Length: 0
Date: Mon, 07 May 2018 13:12:02 GMT

HTTP/1.1 201 Created
Content-Length: 0
Date: Mon, 07 May 2018 13:12:02 GMT

HTTP/1.1 201 Created
Content-Length: 0
Date: Mon, 07 May 2018 13:12:02 GMT

HTTP/1.1 201 Created
Content-Length: 0
Date: Mon, 07 May 2018 13:12:02 GMT

HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Date: Mon, 07 May 2018 13:12:02 GMT

{"action_id":"start","action_type":"InstanceStart","timestamp":0}

Integration tests

96.61s setup    functional/test_api.py::test_api_put_update_pre_boot[new-lambda]
94.08s setup    functional/test_api.py::test_api_put_update_post_boot[new-lambda]
85.57s setup    functional/test_api.py::test_api_put_update_post_boot[old-lambda]
85.44s setup    functional/test_api.py::test_api_put_update_post_boot[ubuntu]
81.79s setup    functional/test_api.py::test_api_put_update_pre_boot[old-lambda]
=============================================================================== 6 passed in 539.66 seconds ===============================================================================

@alxiord alxiord changed the title Api updates Allow PUT updates before guest boot and deny them post boot May 10, 2018
@alxiord alxiord requested a review from a team May 10, 2018 15:50
@alxiord alxiord self-assigned this May 10, 2018
@alxiord alxiord added this to the Customer 1 Production milestone May 10, 2018
Copy link
Member

@andreeaflorescu andreeaflorescu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an overview, I am really not a big fan of having commits that change more than 1 thing. I would split the commits in logical units. One example would be the first commit where you added the UpdateNotAllowed and also splitted the run_api_cmd function. This can be a problem for example when we discover a bug in a commit and we want to revert it. We would have to also revert code that was not broken. If it is too much work to split the commits, you can leave them as they are. But for future commits, I say we try to have only 1 logical change/per commit where it is possible.


for cfg in self.config_list.iter_mut() {
if cfg.drive_id == block_device_config.drive_id {
cfg.is_root_device = block_device_config.is_root_device;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you allow update on fields like read_only and root_device you also need to update these fields in the config_list. If the device you update is the root device and you want to to make it non-root, you have to update both has_root_block and read_only_root

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the flags in the config_list accordingly.

.is_ok()
);
let ret = block_devices_configs.add(dummy_block_device.clone());
// clean up before the assert!s to make sure the temp files are deleted
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are you making sure that the temp files are deleted if you move the assert here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'm not verifying in any way that the temp files get cleaned up, but by moving the cleanup before the assert!, it will take place before the test panics in case of failure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I guess this works for this particular tests. We usually put the asserts right after the operation because it is more clear what you want to test.

}

#[test]
fn test_has_ro_root() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test module is almost twice the size of the actual implementation of drives. I don't think we need to be so explicit in tests because if we have such a long test module, nobody will actually go over it and update it. With that in mind, I would remove the following tests: test_has_ro_root, test_contains_drive_path and test_contains_drive_id. Is relatively easy to test the corresponding function with a simple assert in the existing tests.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed test_has_ro_root, test_contains_drive_path and test_contains_drive_id. Moved the relevant assert!s in pre-existing tests.

@@ -1,5 +1,5 @@
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to change the commit message, you misspelled updates

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -91,3 +91,196 @@ def test_api_happy_start(test_microvm_any, uhttp):
)
""" Issues a power-on command to the microvm. """
assert(uhttp.is_good_response(response.status_code))


def test_api_put_update_pre_boot(test_microvm_any, uhttp):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to add a timeout limit? Similar to what we have for test_api_happy_start?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also a default timeout of 300s that applies to all test cases.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaving it to the default of 300 as files are moved around to validate correct updates and this takes a while. I haven't measured how long it would take on an i3p, but on my dev machine I got between 10-30s extra time for copying the rootfs. Of course this varies with the size of the rootfs so I left the timeout at a generous 300s.

@@ -129,26 +182,29 @@ mod tests {
fn test_add_non_root_block_device() {
let dummy_filename = String::from("test_add_non_root_block_device");
let dummy_path = create_dummy_path(dummy_filename.clone());
let dummy_id = String::from("1");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more of a general observation, rather than being specifically about this PR, but it turns out there are ways to create temp files which are automatically deleted & stuff. In Rust, we could use something like the tempfile crate. If we decide to start using it at some point, we can add it to [dev-dependencies] in the appropriate .toml file, and extern crate it only in test modules, so it's completely ignored outside of tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding this as an issue because we have many tests that use tempfiles.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logged #281 to track this.

@@ -90,6 +90,7 @@ pub enum Error {
GuestCIDAlreadyInUse,
GuestMacAddressInUse,
OpenTap(TapError),
UpdateNotAllowed,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to rename this, or add another enum element/error message along the lines of UpdateNotAllowedPostBoot? Receiving UpdateNotAllowed can make one believe that the update operation is forbidden in general for the given resource :-s Just throwing this out there :-s :-s

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to UpdateNotAllowedPostBoot.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The rename went in the next commit during the rebase.)

vmm/src/lib.rs Outdated

fn handle_put_drive(&mut self, drive_description: DriveDescription, sender: SyncOutcomeSender) {
match self.put_block_device(BlockDeviceConfig::from(drive_description)) {
Ok(outcome) => sender
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following piece of code:

sender
    .send(Box<Something>)
    .map_err(|_| ())
    .expect("one-shot channel closed")

appears an awful lot throughout the code. We could make the code prettier by having a helper function which hides having to type these lines again and again. If we want to go to the next beauty level, we could implement the GenerateResponse trait for a Result, like this:

impl<V, E> GenerateResponse for Result<V, E> 
    where V: GenerateResponse,
               E: GenerateResponse {
    fn generate_response(&self) -> Response {
        return match self {
            Ok(v) => v.generate_response(),
            Err(e) => e.generate_response() 
        }
    }
}

This basically says that if both the Ok and the Err parts of a Result implement GenerateResponse, then the Result itself implements GenerateResponse, so we can send the Result variable through the channel without extracting the actual inner value first. Thus, matches like this one would no longer be necessary :-s

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but it also seems like we should add an issue for this because is not directly related to this PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logged #282 to track this. I agree with the solution, but it's a solution for a different issue than the one addressed in this PR.

} else {
// There are no root block devices in the list.
self.has_root_block = block_device_config.is_root_device;
self.read_only_root = block_device_config.is_read_only;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The read_only_root should only be set in case the device you are updating is root block. This code seems to be with lots of corner cases. After we implement the PATCH update and add an operation for "refreshing" the size of the drive, I propose removing this.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, my bad. I agree that it's turning into if-else-if-else spaghetti and can't wait to see it gone. I logged #280 to track it.

return Err(DriveError::InvalidBlockDevicePath);
}

let root_id = self.get_root_id();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code starting from here it is a bit more complicated than it should. I would restructure it a bit and first get the drive you want to update from the list.
Then you would only have 2 cases:

  1. If the device is root or wants to be root, update has_read_only_root and has_root_device based on the input
  2. If the update is not possible because a root device already exists -> error
    After those two checks, you just update all the fields from the device you got from BlockDeviceConfigs according to the input.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote this section per your suggestion, take a look

@@ -129,26 +182,29 @@ mod tests {
fn test_add_non_root_block_device() {
let dummy_filename = String::from("test_add_non_root_block_device");
let dummy_path = create_dummy_path(dummy_filename.clone());
let dummy_id = String::from("1");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest adding this as an issue because we have many tests that use tempfiles.

vmm/src/lib.rs Outdated

fn handle_put_drive(&mut self, drive_description: DriveDescription, sender: SyncOutcomeSender) {
match self.put_block_device(BlockDeviceConfig::from(drive_description)) {
Ok(outcome) => sender
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, but it also seems like we should add an issue for this because is not directly related to this PR.

Alexandra Ghecenco added 4 commits May 21, 2018 13:40
Split run_api_cmd into smaller functions handling each match branch
to improve readability and avoid duplicate code.

Signed-off-by: Alexandra Ghecenco <[email protected]>
vmm returns 403 Forbidden to PUT operations issued after the guest has booted,
except for block devices where live resize is supported.

Signed-off-by: Alexandra Ghecenco <[email protected]>
* implemented update with PUT for /boot-source, /drives, /network-interfaces and /vsock
* added unit tests in the modified files to improve coverage
* the updates are meant to be *full* updates, not partial ones; partial
  updates will be implemented as PATCH operations in the future
* block device live resize is still a PUT; it will become a PUT
  /drives/operations in the future

Signed-off-by: Alexandra Ghecenco <[email protected]>
* only covering PUT updates

Signed-off-by: Alexandra Ghecenco <[email protected]>
@alxiord alxiord merged commit bfe2e64 into firecracker-microvm:master May 22, 2018
@alxiord alxiord deleted the api_updates branch May 25, 2018 12:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants