
Curious null-coalescing operator custom implicit conversion behaviour


This question arose when writing my answer to this one, which talks about the associativity of the null-coalescing operator.



Just as a reminder, the idea of the null-coalescing operator is that an expression of the form




x ?? y



first evaluates x, then:



  • If the value of x is null, y is evaluated and that is the end result of the expression

  • If the value of x is non-null, y is not evaluated, and the value of x is the end result of the expression, after a conversion to the compile-time type of y if necessary



Usually there's no need for a conversion: either the types are the same, or the conversion is just from a nullable type to a non-nullable one (say, int? to int). However, you can create your own implicit conversion operators, and those are used where necessary.
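As a minimal sketch of the rules above (the class `CoalesceBasics` and its `Fallback` counter are illustrative names, not part of the original question), the following shows that the right-hand operand is only evaluated when the left-hand one is null, and that `int? ?? int` produces a plain `int`:

```csharp
using System;

public static class CoalesceBasics
{
    // Illustrative counter: how often the right-hand operand was evaluated.
    public static int FallbackCalls;

    public static int Fallback()
    {
        FallbackCalls++;
        return -1;
    }

    // int? ?? int has compile-time type int, so a non-null x is unwrapped.
    public static int Coalesce(int? x)
    {
        return x ?? Fallback();
    }

    public static void Main()
    {
        Console.WriteLine(Coalesce(5));     // 5; Fallback() is not called
        Console.WriteLine(FallbackCalls);   // 0
        Console.WriteLine(Coalesce(null));  // -1; Fallback() is called once
        Console.WriteLine(FallbackCalls);   // 1
    }
}
```

No user-defined conversion is involved here, so this is the well-behaved case the question contrasts with.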



For the simple case of x ?? y, I haven't seen any odd behaviour. However, with (x ?? y) ?? z I see some confusing behaviour.



Here's a short but complete test program - the results are in the comments:




using System;

public struct A
{
    public static implicit operator B(A input)
    {
        Console.WriteLine("A to B");
        return new B();
    }

    public static implicit operator C(A input)
    {
        Console.WriteLine("A to C");
        return new C();
    }
}

public struct B
{
    public static implicit operator C(B input)
    {
        Console.WriteLine("B to C");
        return new C();
    }
}

public struct C {}

class Test
{
    static void Main()
    {
        A? x = new A();
        B? y = new B();
        C? z = new C();
        C zNotNull = new C();

        Console.WriteLine("First case");
        // This prints
        // A to B
        // A to B
        // B to C
        C? first = (x ?? y) ?? z;

        Console.WriteLine("Second case");
        // This prints
        // A to B
        // B to C
        var tmp = x ?? y;
        C? second = tmp ?? z;

        Console.WriteLine("Third case");
        // This prints
        // A to B
        // B to C
        C? third = (x ?? y) ?? zNotNull;
    }
}



So we have three custom value types, A, B and C, with conversions from A to B, A to C, and B to C.



I can understand both the second case and the third case... but why is there an extra A to B conversion in the first case? In particular, I'd really have expected the first case and second case to be the same thing - it's just extracting an expression into a local variable, after all.



Any takers on what's going on? I'm extremely hesitant to cry "bug" when it comes to the C# compiler, but I'm stumped as to what's going on...



EDIT: Okay, here's a nastier example of what's going on, thanks to configurator's answer, which gives me further reason to think it's a bug. EDIT: The sample doesn't even need two null-coalescing operators now...




using System;

public struct A
{
    public static implicit operator int(A input)
    {
        Console.WriteLine("A to int");
        return 10;
    }
}

class Test
{
    static A? Foo()
    {
        Console.WriteLine("Foo() called");
        return new A();
    }

    static void Main()
    {
        int? y = 10;

        int? result = Foo() ?? y;
    }
}



The output of this is:




Foo() called
Foo() called
A to int



The fact that Foo() gets called twice here is hugely surprising to me - I can't see any reason for the expression to be evaluated twice.


Source: Tips4all

Comments

  1. Thanks to everyone who contributed to analyzing this issue. It is clearly a compiler bug. It appears to only happen when there is a lifted conversion involving two nullable types on the left-hand side of the coalescing operator.

    I have not yet identified where precisely things go wrong, but at some point during the "nullable lowering" phase of compilation -- after initial analysis but before code generation -- we reduce the expression

    result = Foo() ?? y;


    from the example above to the moral equivalent of:

    A? temp = Foo();
    result = temp.HasValue ?
        new int?(A.op_Implicit(Foo().Value)) :
        y;


    Clearly that is incorrect; the correct lowering is

    result = temp.HasValue ?
        new int?(A.op_Implicit(temp.Value)) :
        y;


    My best guess based on my analysis so far is that the nullable optimizer is going off the rails here. We have a nullable optimizer that looks for situations where we know that a particular expression of nullable type cannot possibly be null. Consider the following naive analysis: we might first say that

    result = Foo() ?? y;


    is the same as

    A? temp = Foo();
    result = temp.HasValue ?
    (int?) temp :
    y;


    and then we might say that

    conversionResult = (int?) temp


    is the same as

    A? temp2 = temp;
    conversionResult = temp2.HasValue ?
    new int?(op_Implicit(temp2.Value)) :
    (int?) null


    But the optimizer can step in and say "whoa, wait a minute, we already checked that temp is not null; there's no need to check it for null a second time just because we are calling a lifted conversion operator". We'd then optimize it away to just

    new int?(op_Implicit(temp2.Value))


    My guess is that we are somewhere caching the fact that the optimized form of (int?)Foo() is new int?(op_Implicit(Foo().Value)), but that is not actually the optimized form we want; we want the optimized form of Foo()-replaced-with-temporary-and-then-converted.

    Many bugs in the C# compiler are a result of bad caching decisions. A word to the wise: every time you cache a fact for use later, you are potentially creating an inconsistency should something relevant change. In this case the relevant thing that has changed post initial analysis is that the call to Foo() should always be realized as a fetch of a temporary.

    We did a lot of reorganization of the nullable rewriting pass in C# 3.0. The bug reproduces in C# 3.0 and 4.0 but not in C# 2.0, which means that the bug was probably my bad. Sorry!

    I'll get a bug entered into the database and we'll see if we can get this fixed up for a future version of the language. Thanks again everyone for your analysis; it was very helpful!
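To make the difference between the two lowerings concrete, here is a hand-written sketch. It is not compiler output: `FaultyLowering`, `IntendedLowering` and the `FooCalls` counter are hypothetical names, and the explicit `(int)` cast stands in for the `op_Implicit` call. It only mirrors the two expansions described in the comment above and counts how often `Foo()` runs in each:

```csharp
using System;

public struct A
{
    public static implicit operator int(A input)
    {
        return 10;
    }
}

public static class LoweringSketch
{
    // Illustrative counter: number of times Foo() has been evaluated.
    public static int FooCalls;

    public static A? Foo()
    {
        FooCalls++;
        return new A();
    }

    // Mirrors the faulty expansion: Foo() is evaluated again inside the
    // conversion instead of reusing the temporary.
    public static int? FaultyLowering(int? y)
    {
        A? temp = Foo();
        return temp.HasValue ? new int?((int)Foo().Value) : y;
    }

    // Mirrors the intended expansion: the temporary is reused.
    public static int? IntendedLowering(int? y)
    {
        A? temp = Foo();
        return temp.HasValue ? new int?((int)temp.Value) : y;
    }

    public static void Main()
    {
        FooCalls = 0;
        FaultyLowering(10);
        Console.WriteLine(FooCalls);  // 2

        FooCalls = 0;
        IntendedLowering(10);
        Console.WriteLine(FooCalls);  // 1
    }
}
```

Both methods return the same value; only the number of evaluations of `Foo()` differs, which matches the doubled "Foo() called" line in the question.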

  2. If you take a look at the generated code for the Left-grouped case it actually does something like this (csc /optimize-):

    C? first;
    A? atemp = a;
    B? btemp = (atemp.HasValue ? new B?(a.Value) : b);
    if (btemp.HasValue)
    {
    first = new C?((atemp.HasValue ? new B?(a.Value) : b).Value);
    }


    Another finding: if you use first, the compiler generates a shortcut that returns c when both a and b are null. Yet if a or b is non-null, it re-evaluates a as part of the implicit conversion to B before returning whichever of a or b is non-null.

    From the C# 4.0 Specification, §6.1.4:



    If the nullable conversion is from S? to T?:

    If the source value is null (HasValue property is false), the result is the null value of type T?.
    Otherwise, the conversion is evaluated as an unwrapping from S? to S, followed by the underlying conversion from S to T, followed by a wrapping (§4.1.10) from T to T?.




    This appears to explain the second unwrapping-wrapping combination.
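The quoted rule can be sketched as a small generic helper. `Lift` is a hypothetical name used only for illustration; the compiler emits equivalent logic inline rather than calling any such method:

```csharp
using System;

public static class LiftedConversionSketch
{
    // Hypothetical helper mirroring the quoted rule for an S? to T? conversion:
    // null maps to null; otherwise unwrap, convert S to T, and wrap again.
    public static T? Lift<S, T>(S? source, Func<S, T> convert)
        where S : struct
        where T : struct
    {
        if (!source.HasValue)
        {
            return null;
        }
        S unwrapped = source.Value;        // unwrapping: S? to S
        T converted = convert(unwrapped);  // underlying conversion: S to T
        return converted;                  // wrapping: T to T?
    }

    public static void Main()
    {
        int? some = 3;
        int? none = null;
        long? a = Lift<int, long>(some, i => i * 10L);
        long? b = Lift<int, long>(none, i => i * 10L);
        Console.WriteLine(a.HasValue ? a.Value.ToString() : "null");  // 30
        Console.WriteLine(b.HasValue ? b.Value.ToString() : "null");  // null
    }
}
```

Note that the source is read once and the delegate runs at most once per call, which is the behaviour one would expect from a single lifted conversion.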



    The C# 2008 and 2010 compilers produce very similar code; however, this looks like a regression from the C# 2005 compiler (8.00.50727.4927), which generates the following code for the above:

    A? a = x;
    B? b = a.HasValue ? new B?(a.GetValueOrDefault()) : y;
    C? first = b.HasValue ? new C?(b.GetValueOrDefault()) : z;


    I wonder if this is not due to the additional magic given to the type inference system?

  3. This is most definitely a bug.

    public class Program {
        static A? X() {
            Console.WriteLine("X()");
            return new A();
        }
        static B? Y() {
            Console.WriteLine("Y()");
            return new B();
        }
        static C? Z() {
            Console.WriteLine("Z()");
            return new C();
        }

        public static void Main() {
            C? test = (X() ?? Y()) ?? Z();
        }
    }


    This code will output:

    X()
    X()
    A to B (0)
    X()
    X()
    A to B (0)
    B to C (0)


    That made me think that the first part of each ?? coalesce expression is evaluated twice.
    This code proved it:

    B? test= (X() ?? Y());


    outputs:

    X()
    X()
    A to B (0)


    This seems to happen only when the expression requires a conversion between two nullable types; I've tried various permutations with one of the sides being a string, and none of them caused this behaviour.

  4. Console.WriteLine("First case");
    A? a2 = a;
    B? b2 = a2.HasValue ? new B?(a.Value) : b;
    if (b2.HasValue)
    {
    a2 = a;
    B? b3 = a2.HasValue ? new B?(a.Value) : b;
    new C?(b3.Value);
    }
    Console.WriteLine("Second case");
    a2 = a;
    B? b4 = a2.HasValue ? new B?(a.Value) : b;
    b2 = b4;
    C? arg_FB_0 = b2.HasValue ? new C?(b4.Value) : c;
    Console.WriteLine("Third case");
    a2 = a;
    b2 = (a2.HasValue ? new B?(a.Value) : b);
    C? c3 = new C?(b2.HasValue ? b2.GetValueOrDefault() : c2);


    The answer is in the decompiled code.
    It is evaluating the first expression twice.
    I don't see any reason to evaluate the expression again.
    I'd call it a bug.

  5. Actually, I'll call this a bug now, with the clearer example. This still holds, but the double-evaluation is certainly not good.

    It seems as though A ?? B is implemented as A.HasValue ? A : B. In this case, there's a lot of casting too (following the regular casting for the ternary ?: operator). But if you ignore all that, then this makes sense based on how it's implemented:


    A ?? B expands to A.HasValue ? A : B
    A is our x ?? y. Expand to x.HasValue ? x : y
    replace all occurrences of A -> (x.HasValue ? x : y).HasValue ? (x.HasValue ? x : y) : B


    Here you can see that x.HasValue is checked twice, and if x ?? y requires casting, x will be cast twice.

    I originally put this down simply as an artifact of how ?? is implemented, rather than a compiler bug, with the take-away: don't create implicit casting operators with side effects.

    It now seems to be a compiler bug revolving around how ?? is implemented. Take-away: don't nest coalescing expressions with side effects.

  6. I am not a C# expert at all, as you can see from my question history, but I tried this out and I think it is a bug. As a newbie, though, I have to say that I do not understand everything going on here, so I will delete my answer if I am way off.

    I came to this conclusion by making a different, much less complicated version of your program that deals with the same scenario.

    I am using three nullable integer properties with backing stores. I set each to 4 and then run int? something2 = (A ?? B) ?? C;

    (Full code here)

    This just reads the A and nothing else.

    To me, this statement looks like it should:


    Start in the brackets, look at A, return A and finish if A is not null.
    If A was null, evaluate B, finish if B is not null
    If A and B were null, evaluate C.


    So, as A is not null, it only looks at A and finishes.

    In your example, putting a breakpoint at the First Case shows that x, y and z are all not null and therefore, I would expect them to be treated the same as my less complex example.... but I fear I am too much of a C# newbie and have missed the point of this question completely!
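The full code for this comment is not reproduced here, so the following is only a sketch of the experiment it describes, under the assumption that each property simply counts its reads (`PropertyReads` and the `ReadsOf...` counters are made-up names). Because all three operands are `int?`, no conversion is involved and only A is read:

```csharp
using System;

public class PropertyReads
{
    // Illustrative counters: how often each property getter ran.
    public int ReadsOfA, ReadsOfB, ReadsOfC;

    private int? a = 4, b = 4, c = 4;

    public int? A { get { ReadsOfA++; return a; } }
    public int? B { get { ReadsOfB++; return b; } }
    public int? C { get { ReadsOfC++; return c; } }

    public int? Evaluate()
    {
        return (A ?? B) ?? C;
    }

    public static void Main()
    {
        var p = new PropertyReads();
        Console.WriteLine(p.Evaluate());  // 4
        Console.WriteLine(p.ReadsOfA);    // 1
        Console.WriteLine(p.ReadsOfB);    // 0
        Console.WriteLine(p.ReadsOfC);    // 0
    }
}
```

This differs from the question's program in that no lifted user-defined conversion sits on the left-hand side, which is the condition the other comments identify as triggering the double evaluation.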

