Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SF#390 scalar PDL with badvalue always compares BAD with perl scalars #124

Closed
zmughal opened this issue Jun 17, 2015 · 3 comments
Closed

Comments

@zmughal
Copy link
Member

zmughal commented Jun 17, 2015

From http://sourceforge.net/p/pdl/bugs/390/

This problem was reported by as user summarized in this message:
https://sourceforge.net/p/pdl/mailman/message/34211884/
as shown in this sample program:

use PDL;
use PDL::NiceSlice;

my $x = pdl(1, 2, 3, 0);
$x->badflag(1);
$x->badvalue(0);

print "x = $x\n";

my ($m, $s) = stats($x);

print "m = $m, s = $s\n";
print "s greater than zero\n" if $s > 0;
print "s less than zero\n" if $s < 0;
print "s equals zero\n" if $s == 0;

In fact, other comparisons of a scalar PDL value with the badflag set show the same error:

pdl> $s = pdl(0)

pdl> $s->badflag(1)

pdl> p $s
0

pdl> p $s>3
0

pdl> p $s->badvalue(0)
0

pdl> p $s>3
BAD

pdl> $s .= 2

pdl> p $s
2

pdl> p $s>3
BAD

pdl> p $s->badvalue(35)
35

pdl> p $s>3
0

pdl> p $s->badvalue(2)
2

pdl> p $s>3
BAD

Given the weird and not yet understood behavior, I have marked this as highest priority.

zmughal added a commit that referenced this issue Jun 21, 2015
This is to elucidate the issues with stat() and per-PDL badvalues
brought up by Marek Gierliński in SF#390
<http://sourceforge.net/p/pdl/bugs/390/>,
<#124>.
@zmughal
Copy link
Member Author

zmughal commented Jun 21, 2015

I have put together some tests on the sf390 branch:
#128.

I think I may have a direct fix for the problem regarding
stats(). The following test shows what I believe is a fundamental issue
with the logic found in stats()/statsover() [^1]:

subtest "stats() should not set the badflag for output" => sub {
    my $p = pdl [1, 2, 3];
    $p->badflag(1);
    $p->badvalue(2);

    diag "\$p = $p";
    is( "$p", "[1 BAD 3]");

    my $m = stats($p);

    is( "$m", "2", "Mean of [1 3] is 2" );
};

The PDL $p in the above stringifies to "[1 BAD 3]" since 2 is now a
badvalue. Now if I calculate the mean of $p, it should ignore the
badvalue and calculate ( 1 + 3 ) / 2 == 2.

However, this is the same as the badvalue that was set for the input
PDL. Which means that the mean $m that is calculated above actually
stringifies to "BAD" rather than the expected "2".

I feel that setting a badflag as the result of statsover() does not make
sense as that we already guarantee that all BAD values are skipped in
the documentation of statsover:

Bad values are simply ignored in the calculation

That would imply badvalues should not be propagated out of statsover()
unless $piddle->isbad->all.

[^1] https://github.com/PDLPorters/pdl/blob/sf390/t/badvalue_scalar_cmp.t#L97-108

zmughal added a commit that referenced this issue Jun 21, 2015
This is to elucidate the issues with stat() and per-PDL badvalues
brought up by Marek Gierliński in SF#390
<http://sourceforge.net/p/pdl/bugs/390/>,
<#124>.
zmughal added a commit that referenced this issue Jun 21, 2015
This is to elucidate the issues with stat() and per-PDL badvalues
brought up by Marek Gierliński in SF#390
<http://sourceforge.net/p/pdl/bugs/390/>,
<#124>.
@zmughal
Copy link
Member Author

zmughal commented Jun 27, 2015

I wanted to know if this bug was a regression or not, so I used a tool I wrote (in the devops repository on GitHub) and it appears that this bug has been around since at least 2008.

zmughal added a commit that referenced this issue Jul 25, 2015
When using a comparison operator, inconsistent results can appear when a
badvalue is either 0 or 1. This is because {0, 1} is the range of all
operators that return a logical PDL. This means that if the input PDL
contains badvalues, it can not both represent badvalues that are the
result of computing the operator *and* logical values at the same time.

This is a continuation of the work addressing the problem brought up in
<http://sourceforge.net/p/pdl/bugs/390/>, <#124>.
zmughal added a commit that referenced this issue Jul 28, 2015
This is to elucidate the issues with stat() and per-PDL badvalues
brought up by Marek Gierliński in SF#390
<http://sourceforge.net/p/pdl/bugs/390/>,
<#124>.

When testing the scalar comparison, test across several values on either
side of bad value so that it is clear which case is failing.

The test must be skipped if BADVAL_USENAN: the test sets `badvalue()`.
zmughal added a commit that referenced this issue Jul 28, 2015
When using a comparison operator, inconsistent results can appear when a
badvalue is either 0 or 1. This is because {0, 1} is the range of all
operators that return a logical PDL. This means that if the input PDL
contains badvalues, it can not both represent badvalues that are the
result of computing the operator *and* logical values at the same time.

This is a continuation of the work addressing the problem brought up in
<http://sourceforge.net/p/pdl/bugs/390/>, <#124>.
zmughal added a commit that referenced this issue Jul 28, 2015
This is to elucidate the issues with stat() and per-PDL badvalues
brought up by Marek Gierliński in SF#390
<http://sourceforge.net/p/pdl/bugs/390/>,
<#124>.

When testing the scalar comparison, test across several values on either
side of bad value so that it is clear which case is failing.

The test must be skipped if BADVAL_USENAN: the test sets `badvalue()`.
zmughal added a commit that referenced this issue Jul 28, 2015
When using a comparison operator, inconsistent results can appear when a
badvalue is either 0 or 1. This is because {0, 1} is the range of all
operators that return a logical PDL. This means that if the input PDL
contains badvalues, it can not both represent badvalues that are the
result of computing the operator *and* logical values at the same time.

This is a continuation of the work addressing the problem brought up in
<http://sourceforge.net/p/pdl/bugs/390/>, <#124>.
@zmughal zmughal added this to the PDL v2.012_01 milestone Jul 29, 2015
@zmughal
Copy link
Member Author

zmughal commented Aug 11, 2015

This will be in the next full release: PDL v2.013.

@zmughal zmughal closed this as completed Aug 11, 2015
d-lamb pushed a commit that referenced this issue Jan 19, 2019
This is to elucidate the issues with stat() and per-PDL badvalues
brought up by Marek Gierliński in SF#390
<http://sourceforge.net/p/pdl/bugs/390/>,
<#124>.

When testing the scalar comparison, test across several values on either
side of bad value so that it is clear which case is failing.

The test must be skipped if BADVAL_USENAN: the test sets `badvalue()`.
d-lamb pushed a commit that referenced this issue Jan 19, 2019
When using a comparison operator, inconsistent results can appear when a
badvalue is either 0 or 1. This is because {0, 1} is the range of all
operators that return a logical PDL. This means that if the input PDL
contains badvalues, it can not both represent badvalues that are the
result of computing the operator *and* logical values at the same time.

This is a continuation of the work addressing the problem brought up in
<http://sourceforge.net/p/pdl/bugs/390/>, <#124>.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant