Non-deterministic test? #6


Closed
LukeMathWalker opened this issue Sep 29, 2018 · 5 comments
Labels
Bug, Help wanted

Comments

@LukeMathWalker
Member

LukeMathWalker commented Sep 29, 2018

    #[test]
    fn test_zero_observations() {
        let a = Array2::<f32>::zeros((2, 0));
        let pearson = a.pearson_correlation();
        assert_eq!(pearson.shape(), &[2, 2]);
        let all_nan_flag = pearson.iter().map(|x| x.is_nan()).fold(true, |acc, flag| acc & flag);
        assert_eq!(all_nan_flag, true);
    }

This test fails on Travis for #5, while it succeeds on my local machine.
It's weird - any idea, @jturner314?
I am trying to reproduce the error, but so far I can't.

@LukeMathWalker added the Bug and Help wanted labels Sep 29, 2018
@LukeMathWalker
Member Author

LukeMathWalker commented Sep 29, 2018

On Travis CI the pearson matrix is:

[[-inf, inf],
 [inf, inf]]

If I look into the covariance matrix obtained using the cov method I get:

[[0.0, 0.0],
 [0.0, 0.0]]

This is due to the dot method: with zero observations, denoised is an array of shape [2, 0], and multiplying it with its own transpose gives a matrix filled with zeroes. This seems consistent with the dot implementation when blas is not active:

fn dot_generic<S2>(&self, rhs: &ArrayBase<S2, Ix1>) -> A
    where S2: Data<Elem = A>,
          A: LinalgScalar,
{
    debug_assert_eq!(self.len(), rhs.len());
    assert!(self.len() == rhs.len());
    if let Some(self_s) = self.as_slice() {
        if let Some(rhs_s) = rhs.as_slice() {
            return numeric_util::unrolled_dot(self_s, rhs_s);
        }
    }
    let mut sum = A::zero();
    for i in 0..self.len() {
        unsafe {
            sum = sum.clone() + self.uget(i).clone() * rhs.uget(i).clone();
        }
    }
    sum
}

/// Compute the dot product.
///
/// `xs` and `ys` must be the same length
pub fn unrolled_dot<A>(xs: &[A], ys: &[A]) -> A
    where A: LinalgScalar,
{
    debug_assert_eq!(xs.len(), ys.len());
    // eightfold unrolled so that floating point can be vectorized
    // (even with strict floating point accuracy semantics)
    let len = cmp::min(xs.len(), ys.len());
    let mut xs = &xs[..len];
    let mut ys = &ys[..len];
    let mut sum = A::zero();
    let (mut p0, mut p1, mut p2, mut p3,
         mut p4, mut p5, mut p6, mut p7) =
        (A::zero(), A::zero(), A::zero(), A::zero(),
         A::zero(), A::zero(), A::zero(), A::zero());
    while xs.len() >= 8 {
        p0 = p0 + xs[0] * ys[0];
        p1 = p1 + xs[1] * ys[1];
        p2 = p2 + xs[2] * ys[2];
        p3 = p3 + xs[3] * ys[3];
        p4 = p4 + xs[4] * ys[4];
        p5 = p5 + xs[5] * ys[5];
        p6 = p6 + xs[6] * ys[6];
        p7 = p7 + xs[7] * ys[7];

        xs = &xs[8..];
        ys = &ys[8..];
    }
    sum = sum + (p0 + p4);
    sum = sum + (p1 + p5);
    sum = sum + (p2 + p6);
    sum = sum + (p3 + p7);

    for i in 0..xs.len() {
        if i >= 7 { break; }
        unsafe {
            // get_unchecked is needed to avoid the bounds check
            sum = sum + xs[i] * *ys.get_unchecked(i);
        }
    }
    sum
}
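
For the zero-observations case the behavior is easy to check without ndarray: with empty inputs neither the unrolled loop nor the tail loop ever runs, so the accumulator is returned untouched. A minimal stdlib-only sketch (the `dot` below is a hypothetical stand-in, not ndarray's actual code):

```rust
/// Stand-in for a generic dot product: fold over the zipped slices.
fn dot(xs: &[f32], ys: &[f32]) -> f32 {
    assert_eq!(xs.len(), ys.len());
    xs.iter().zip(ys).fold(0.0, |acc, (x, y)| acc + x * y)
}

fn main() {
    // With zero observations each row is an empty slice, so every
    // entry of denoised · denoisedᵀ is the empty sum: 0.0.
    let empty: [f32; 0] = [];
    assert_eq!(dot(&empty, &empty), 0.0);

    // Sanity check on non-empty input.
    assert_eq!(dot(&[1.0, 2.0], &[3.0, 4.0]), 11.0);
}
```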

Applying the same line of reasoning, std_matrix in pearson_correlation is a matrix filled with zeroes. What should I expect from the component-wise division of two zero-filled matrices? NaN? Random infs?

@LukeMathWalker
Member Author

I have added a test to check that the covariance matrix with zero observations is indeed filled with zeroes, and once again I get different results on Travis CI and my local machine:

test correlation::cov_tests::test_covariance_zero_observations ... FAILED
    #[test]
    fn test_covariance_zero_observations() {
        let a = Array2::<f32>::zeros((2, 0));
        // Negative ddof (-1), to avoid the invalid-ddof panic when there are zero observations
        let cov = a.cov(-1.);
        assert_eq!(cov.shape(), &[2, 2]);
        assert!(cov.all_close(&Array2::<f32>::zeros((2, 2)), 1e-8));
    }
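
For reference: assuming cov divides the (zero) sum of products by n_observations - ddof, as NumPy does, ddof = -1 makes the denominator 0 - (-1) = 1, so every entry is 0/1 = 0 rather than 0/0 = NaN. A quick stdlib-only check of that arithmetic:

```rust
fn main() {
    let n_observations = 0.0_f32;
    let ddof = -1.0_f32;
    // Each covariance entry is (empty sum) / (n_observations - ddof).
    let denominator = n_observations - ddof; // 1.0
    let entry = 0.0 / denominator;
    assert_eq!(entry, 0.0);
    // With ddof = 0, the same entry would instead be 0/0 = NaN.
    assert!((0.0_f32 / (n_observations - 0.0)).is_nan());
}
```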

@jturner314
Member

You've discovered a couple of bugs:

  • It appears that .dot() of a 2 × 0 matrix and a 0 × 2 matrix returns a 2 × 2 matrix of uninitialized memory, when it should return a 2 × 2 matrix of zeros.

  • The test_covariance_zero_observations test I added in #4 (Document and test .cov() for empty arrays) is broken in a couple of ways: cov.mapv(|x| x.is_nan()) doesn't assert anything; it should have been cov.mapv(|x| assert!(x.is_nan())). But NaN values aren't correct in this case anyway (compare to NumPy, which gives NaNs for ddof = 0 and zeros for ddof = -1); it really should have been cov.mapv(|x| assert_eq!(x, 0.)), as you discovered.
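
The expected behavior in the first bullet can be illustrated with a naive stdlib-only matmul (a hypothetical helper, not ndarray's .dot()): when the inner dimension is 0, the innermost loop never runs, so a correct implementation returns all zeros rather than uninitialized memory.

```rust
/// Hypothetical naive matmul over flat row-major buffers:
/// a is m×k, b is k×n, the result is m×n.
fn matmul(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    let mut out = vec![0.0; m * n]; // zero-initialized up front
    for i in 0..m {
        for j in 0..n {
            for l in 0..k {
                out[i * n + j] += a[i * k + l] * b[l * n + j];
            }
        }
    }
    out
}

fn main() {
    // 2×0 times 0×2: the inner loop over l is empty,
    // so the 2×2 result stays all zeros.
    let c = matmul(&[], &[], 2, 0, 2);
    assert_eq!(c, vec![0.0; 4]);
}
```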

I don't have time right now to fix these things, but I'll take a look later.

@jturner314
Member

I think #7 should fix the nondeterministic behavior you are observing.

> What should I expect from the component-wise division of two zero-filled matrices? NaN? Random infs?

You can see the behavior of floating point division by zero with this program:

fn main() {
    for a in &[-1., -0., 0., 1.] {
        for b in &[-0., 0.] {
            println!("{:?} / {:?} = {:?}", a, b, a / b);
        }
    }
}

The result is:

-1.0 / -0.0 = inf
-1.0 / 0.0 = -inf
-0.0 / -0.0 = NaN
-0.0 / 0.0 = NaN
0.0 / -0.0 = NaN
0.0 / 0.0 = NaN
1.0 / -0.0 = -inf
1.0 / 0.0 = inf

In other words, zero divided by zero is NaN, while non-zero divided by zero is ±inf (sign determined by both operands).
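
The same semantics can be confirmed with assertions rather than printed output; this is standard IEEE 754 division, so it holds for both f32 and f64:

```rust
fn main() {
    // zero / zero is NaN, regardless of the signs of the zeros
    assert!((0.0_f64 / 0.0).is_nan());
    assert!((-0.0_f64 / -0.0).is_nan());

    // non-zero / zero is ±inf; the sign is the XOR of the operand signs
    assert_eq!(1.0_f64 / 0.0, f64::INFINITY);
    assert_eq!(-1.0_f64 / 0.0, f64::NEG_INFINITY);
    assert_eq!(1.0_f64 / -0.0, f64::NEG_INFINITY);
    assert_eq!(-1.0_f64 / -0.0, f64::INFINITY);
}
```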

For the shape = (2, 0) case, the correlation coefficient should be a 2×2 array of NaNs. You can confirm this with NumPy:

>>> import numpy as np
>>> np.corrcoef(np.zeros((2, 0)))
/usr/lib/python3.7/site-packages/numpy/lib/function_base.py:356: RuntimeWarning: Mean of empty slice.
  avg = a.mean(axis)
/usr/lib/python3.7/site-packages/numpy/core/_methods.py:78: RuntimeWarning: invalid value encountered in true_divide
  ret, rcount, out=ret, casting='unsafe', subok=False)
/usr/lib/python3.7/site-packages/numpy/lib/function_base.py:2392: RuntimeWarning: Degrees of freedom <= 0 for slice
  c = cov(x, y, rowvar)
/usr/lib/python3.7/site-packages/numpy/lib/function_base.py:2326: RuntimeWarning: divide by zero encountered in true_divide
  c *= np.true_divide(1, fact)
/usr/lib/python3.7/site-packages/numpy/lib/function_base.py:2326: RuntimeWarning: invalid value encountered in multiply
  c *= np.true_divide(1, fact)
array([[nan, nan],
       [nan, nan]])

@LukeMathWalker
Member Author

Great - thanks for the quick fix!
It turns out I was looking at the wrong dot function 😛
I'll merge master back into my branch to finalize the PR then!
