Skip to content

Support closures as array initializer #2092

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
lu-zero opened this issue Aug 1, 2017 · 13 comments
Open

Support closures as array initializer #2092

lu-zero opened this issue Aug 1, 2017 · 13 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@lu-zero
Copy link

lu-zero commented Aug 1, 2017

Currently the way to initialize an array is a bit unwieldy:

use std::mem;

#[derive(Debug)]
struct Codebook<T> {
    x: T,
    y: T
}

fn fill_thing_array(a : &mut [[Codebook<u8>; 4]; 4]) {
    for (i, line) in a.iter_mut().enumerate() {
        for (j, el) in line.iter_mut().enumerate() {
            *el = Codebook { x: i as u8, y: j as u8};
        }
    }
}

... 

    let mut a = unsafe { mem::uninitialized() };
    fill_thing_array(&mut a);
}

Would be nicer to provide a syntax to have something along the lines of

let a = [[|x, y| [Codebook {x: i as u8, y: j as u8}]; 4]; 4];
@vks
Copy link

vks commented Aug 1, 2017

Your implementation is unsound for general closures, because the closure might panic, leaving the array in an undefined state. This is a problem, because drop will still run on it.

I recommend to give this crate a try: https://github.com/Manishearth/array-init
It does not require an extension of the language, but it only works for arrays with up to 64 elements.

@lu-zero
Copy link
Author

lu-zero commented Aug 1, 2017

How would it be different from having the following code?

fn fun(x: usize, y: usize) -> Codebook<u8> {
    if y != 0 { panic!() };
    Codebook {x: x as u8, y: y as u8}
}

...

    let c = [fun(1, 0), fun(2, 1), fun(3, 0)];
    println!("{:?}", c);

https://play.rust-lang.org/?gist=d85cfc9f6d0d3e2e4cf6ace565d226df&version=stable

functions used to initialize an array could panic and leave it partially initialized already.

The whole proposal started because if you have to compile large (think few k elements) tables you'd like to do that in the most succinct way.

@dtolnay
Copy link
Member

dtolnay commented Aug 1, 2017

functions used to initialize an array could panic and leave it partially initialized already.

Nope, if one of the array element initializers panics then only the successfully constructed elements are dropped. In your fill_thing_array example all of the uninitialized elements are dropped as well. This would go wrong if T=String for example, String's drop implementation would segfault (if you are lucky).

Here is an example that demonstrates the difference.

#[derive(Debug)]
struct NoisyDrop(u8);

impl Drop for NoisyDrop {
    fn drop(&mut self) {
        println!("dropping {:?}", self);
    }
}

#[derive(Debug)]
struct Codebook<T> {
    x: T,
    y: T,
}

fn fun(x: usize, y: usize) -> Codebook<NoisyDrop> {
    if y != 0 { panic!() };
    println!("constructing codebook with x={} y={}", x, y);
    Codebook { x: NoisyDrop(x as u8), y: NoisyDrop(y as u8) }
}

fn main() {
    // fun(2, 1) panics so only fun(1, 0) is dropped
    let c = [fun(1, 0), fun(2, 1), fun(3, 0)];
    println!("{:?}", c);
}

As you can see, fun(1, 0) is successfully constructed and then dropped after the panic.

constructing codebook with x=1 y=0
dropping NoisyDrop(1)
dropping NoisyDrop(0)
thread 'main' panicked at 'explicit panic', src/main.rs:17:16

In the other approach:

use std::mem;

#[derive(Debug)]
struct NoisyDrop(u8);

impl Drop for NoisyDrop {
    fn drop(&mut self) {
        println!("dropping {:?}", self);
    }
}

#[derive(Debug)]
struct Codebook<T> {
    x: T,
    y: T,
}

fn fill_thing_array(a: &mut [[Codebook<NoisyDrop>; 4]; 4]) {
    for (i, line) in a.iter_mut().enumerate() {
        for (j, el) in line.iter_mut().enumerate() {
            if j != 0 { panic!() };
            println!("constructing codebook with x={} y={}", i, j);
            *el = Codebook { x: NoisyDrop(i as u8), y: NoisyDrop(j as u8) };
        }
    }
}

fn main() {
    let mut a = unsafe { mem::uninitialized() };
    fill_thing_array(&mut a);
}

The output is:

constructing codebook with x=0 y=0
dropping NoisyDrop(16)
dropping NoisyDrop(4)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
dropping NoisyDrop(217)
dropping NoisyDrop(50)
dropping NoisyDrop(254)
dropping NoisyDrop(127)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
dropping NoisyDrop(176)
dropping NoisyDrop(178)
dropping NoisyDrop(160)
dropping NoisyDrop(6)
dropping NoisyDrop(235)
dropping NoisyDrop(85)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
dropping NoisyDrop(1)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
dropping NoisyDrop(32)
dropping NoisyDrop(89)
dropping NoisyDrop(50)
dropping NoisyDrop(254)
dropping NoisyDrop(127)
dropping NoisyDrop(0)
dropping NoisyDrop(0)
thread 'main' panicked at 'explicit panic', src/main.rs:21:24

Only one element was constructed but a bunch of uninitialized data was dropped.

array_init::array_init_copy or any of the other functions in the array_init crate are the safe way to handle this situation. Maybe this could be supported better by the language or standard library.

@scottmcm
Copy link
Member

scottmcm commented Aug 1, 2017

Some previous discussion on this topic: #1915 (comment)

(That RFC was closed, choosing to wait for const generics.)

@lu-zero
Copy link
Author

lu-zero commented Aug 1, 2017

That seems to make even a better case for a compact syntax and have it desugar to the unrolled initialization.

@KamilaBorowska
Copy link
Contributor

KamilaBorowska commented Aug 1, 2017

I somehow missed a reference to array-init in this thread, and for some reason created my own implementation in style of vec! macro with the syntax you proposed - https://crates.io/crates/array-macro.

This one doesn't have limitation to number of elements, but it must be a constant integer (not even constant mathematical expression) and cannot run macro inside a macro (for some reason procedural macros don't like this)

(actually cannot run macro inside a macro seems like dtolnay/proc-macro-hack#4)

@vks
Copy link

vks commented Aug 1, 2017 via email

@comex
Copy link

comex commented Aug 1, 2017

I think there should just be an impl<T> FromIterator<T> for [T; N], which would panic if the iterator yielded an incorrect number of elements. (This would be best done with const generics, of course, but AFAICT there's no reason it couldn't be added to the existing, macro-based, list of impls for array sizes 0-32.)

@KamilaBorowska
Copy link
Contributor

KamilaBorowska commented Aug 1, 2017

Or I guess impl<T> FromIterator<T> for Option<[T; N]> as it doesn't seem like a good idea to have unexpected panics like that, I don't know.

@KamilaBorowska
Copy link
Contributor

KamilaBorowska commented Aug 1, 2017

Something like that could be a possible implementation of FromIterator<T> for Option<[T; N]>. Obviously ugly, pretty sure it could better.

#![feature(untagged_unions)]

use std::iter::FromIterator;
use std::mem;
use std::ptr;

// cannot define std implementations for builtin types, so
// i define a wrapper to do so
struct Wrapper<T>(T);

#[allow(unions_with_drop_fields)]
union NoDrop<T> {
    storage: T,
}

struct ArrayVec<F, T>
where
    F: Fn(&mut T, usize),
{
    drop_elems: F,
    array: NoDrop<T>,
    length: usize,
}

impl<F, T> Drop for ArrayVec<F, T>
where
    F: Fn(&mut T, usize),
{
    fn drop(&mut self) {
        unsafe {
            (self.drop_elems)(&mut self.array.storage, self.length);
        }
    }
}

macro_rules! impl_from_iterator {
    ($($n:tt)*) => { $(
        impl<T> FromIterator<T> for Wrapper<Option<[T; $n]>> {
            fn from_iter<I>(iter: I) -> Self
            where
                I: IntoIterator<Item = T>,
            {
                unsafe {
                    // Exists so that element that overflows an array is destructed last
                    let _last;
                    let mut array_vec = ArrayVec {
                        drop_elems: |slice: &mut [T; $n], length| {
                            for item in slice.iter_mut().take(length) {
                                ptr::drop_in_place(item);
                            }
                        },
                        array: NoDrop { storage: mem::uninitialized() },
                        length: 0,
                    };
                    for elem in iter {
                        if array_vec.length == $n {
                            _last = elem;
                            return Wrapper(None);
                        }
                        ptr::write(&mut array_vec.array.storage[array_vec.length], elem);
                        array_vec.length += 1;
                    }
                    if array_vec.length != $n {
                        return Wrapper(None);
                    }
                    let storage = ptr::read(&array_vec.array.storage);
                    mem::forget(array_vec);
                    Wrapper(Some(storage))
                }
            }
        }
    )* };
}

impl_from_iterator!(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20);

#[derive(Debug)]
struct ReportDrops(i32);

impl Drop for ReportDrops {
    fn drop(&mut self) {
        println!("{}", self.0);
    }
}

fn main() {
    {
        let Wrapper(arr): Wrapper<Option<[ReportDrops; 4]>> = (0..4).map(ReportDrops).collect();
        println!("{:?}", arr);
    }
    println!("---");
    {
        let Wrapper(arr): Wrapper<Option<[ReportDrops; 4]>> = (0..3).map(ReportDrops).collect();
        println!("{:?}", arr);
    }
    println!("---");
    {
        let Wrapper(arr): Wrapper<Option<[ReportDrops; 4]>> = (0..42).map(ReportDrops).collect();
        println!("{:?}", arr);
    }
}

Output:

Some([ReportDrops(0), ReportDrops(1), ReportDrops(2), ReportDrops(3)])
0
1
2
3
---
0
1
2
None
---
0
1
2
3
4
None

@lu-zero
Copy link
Author

lu-zero commented Aug 2, 2017

@vks array initialization in certain fields of programming is quite common and would be nice to have an ergonomic way to do that that doesn't require lots of lines of code.

@vks
Copy link

vks commented Aug 2, 2017

@lu-zero I agree that it would be nice to have better support for arrays in the standard library. I'm however not convinced that we need to add new syntax. The current workarounds (each with individual shortcomings) are:

  1. Initialize everything with a default value. (The type has to be Copy.)
  2. Use array_init. (Only works for up to 64 elements.)

I think it is ergonomic and does not require lots of code:

let arr: [u32; 50] = array_init::array_init(|i| (i*i) as u32);

@lu-zero
Copy link
Author

lu-zero commented Aug 2, 2017

array_init currently does not work for large arrays and I'm not 100% sure it would work with multi-dimensional arrays.

Probably I could try to make a procedural macro and see how it behaves.

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Dec 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

No branches or pull requests

7 participants