Skip to content

[Design Bug] C# requires insertion of *hidden* explicit conversions instead of reporting type errors #1528

@Nigel-Ecma

Description

@Nigel-Ecma

Summary

C# has a significant/dangerous design bug – under certain circumstance it inserts hidden conversions that potentially perform invalid mathematics. These hidden UXBs in the code are at very best a violation of the principle of minimum surprise; at worst potentially harmful. If that sounds a bit hyperbolic have a read of “Digital Woes” by Lauren Ruth Weiner and you might decide it is closer to an understatement.

The design bug is fixable, with a well known mechanism – compiler proffered code fixes (“fix-its”) – to assist users in updating their code. While there might be some users relying on the bad math (or so we have been told), who may object to it being fixed, there may also be users pleased to be made aware of the UXBs in their code.

This issue came to the attention of the csharpstandard team after postings on StackOverflow, csharplang & roslyn going back into at least 2024.

This issue is being cross-posted to both csharpstandard and csharplang – full issue in the former only, reference in the latter, it is suggested that any comments are made in the former. (Note: Which received the full issue was a coin flip, do not read anything in to.)

csharplang link: dotnet/csharplang#9917

Disclaimer: This issue submission is not being raised on behalf of csharpstandard, csharplang or roslyn teams per se. Direct any brickbats (or even bouquets) to the author.


The Issue

Consider the following code, a minor variation of code posted by @myblindy (see Credits). It is ready to copy & paste into your favourite IDE and executed:

   struct UnixTime
   {
      public static explicit operator UnixTime(int i)
      {
         System.Console.WriteLine($"UnixTime({i})");
         return default;
      }
   }
   
   class C
   {
      public UnixTime M(long i) => (UnixTime)i;
   }
   
   class Program
   {
      static void Main(string[] args)
      {
         var c = new C();
         var a = 8_589_934_590L;
         System.Console.WriteLine($"M({a})");
         c.M(a);
      }
   }

This passes the number 8,589,934,590 to M, which in turn passes it unaltered to a user-defined explicit conversion operator UnixTime. Running this through Roslyn produces:

   M(8589934590)
   UnixTime(-2)

Hmmm… 8589934590 ≠ −2!

Note: If the number used here seems overly large to put it in perspective consider that if it was US$ it is less than many billionaire fortunes, or that in Korean Won it is less than US$6M. If your C# software wipes out someone’s fortune then it has indeed caused harm (depending on your politics 😉).


Analysis

@jnm2 analysed the Standard (see Credits) to determine whether this is the defined behaviour, and determined it is! So this is a language bug, not the implementation per se as it is following the Standard.

Background: Conversions in C#

C# provides implicit & explicit conversions. Some of these are standard conversions, others user-defined. In summary:

  • Implicit conversions (§10.2):
    • are used automatically by the language at defined points;
    • hence not visible in the code;
    • predefined conversions always succeed and never cause exceptions to be thrown (§10.2.1).

  • Explicit conversions (§10.3):
    • are indicated by cast expressions (§12.9.7);
    • hence visible in the source;
    • include all the implicit conversions;
    • predefined conversions may not succeed:
      • usually throw an exception on failure;
      • numeric conversions behaviour on failure depends on the checking context (§12.8.19):
        • if the context is checked they throw;
        • if the context is unchecked the “value is truncated by discarding its ‘extra’ most significant bits” (§10.3.2) and succeed. I.e. they perform potentially bad math and continue silently – but the software writer must explicitly insert the conversion.

          Note: If you are curious as to why C# might resort to bad math as the default then maybe this will point you in the right direction.

  • Standard conversions (§10.4):
    • Standard implicit conversions (§10.4.2): A subset of the predefined implicit conversions, so all user-defined implicit conversions are excluded.
    • Standard explicit conversions (§10.4.3): All the standard implicit conversions and “the subset of predefined explicit conversions for which the opposite standard implicit conversion exists.”

      For Example: The standard explicit conversions include the implicit conversion from int to long (always safe) and the “opposite” explicit conversion from long to int (generally unsafe and may throw or silently do bad math depending on the checking context).

  • User-defined conversions (§10.5)
    • All user-defined conversions may, if required, be preceded and/or followed by a standard conversion chosen according the the language semantics (§10.5.3).
    • For implicit user-defined conversions any pre/post conversions are standard implicit conversions. All of the 1 to 3 conversions are safe and not visible in the code.
    • For explicit user-defined conversions any pre/post conversions are standard explicit conversions. All of the 1 to 3 conversions may be unsafe, only the original user-defined conversion is visible in the code.

What went wrong in the reported case?

The code they presented:

  • Declares an explicit conversion from int to UnixTime (one might guess what the latter represents but it is irrelevant to the issue).
    • This conversion generates no errors (as presented) so falls into the “conversions across domains of types sufficiently different to merit explicit notation” purpose of explicit conversions (§10.3.1)
  • They invoked this conversion passing a long value, in error, rather than an int
  • They quite reasonably questioned why the error did not produce a compile-time type error, C# is after a type safe language.
  • What they received in lieu of the compile time error was a run-time silent math error inserted by C#!

Swapping a completely reasonable expectation of a compile-time error from a “type safe” C#, to a silent run-time bad math error is an egregious failure on the part of C# – a design bug.


The bug stems from the C# design decision to use standard explicit conversions for the pre/post wrapping of user-defined explicit conversions and thereby introduce the hidden (in the source) possibility of bad math which:

  • being hidden is harder to diagnose; and
  • has unknown, and possibly serious, consequences if not discovered before shipping.

This bug has left an unknown number of UXBs in old code, and will continue to place them in new code.


We have heard the argument that these UXBs have been dropping for some 20-odd years in C# code, and it is not known that anybody has died/been injured/had their fortune wiped out/etc., so it is clearly harmless/of no import/etc.

In the physical world WWII UXBs are still occasionally found. When they are the authorities don’t react by saying “Well it has been there decades and killed no-one, put up a sign and leave it”. No, they call in bomb disposal to make it safe.


Is this just a case of the programmer failing to do overflow handling and not a language issue?

C# either ignores overflow (default, unchecked context) or raises an exception (checked context); but generally who to handle overflow is down to the programmer and there are various ways to test for it regardless of checking context. So is this just a case of the programmer not handing overflow correctly?

We would argue no.

In the code presented the programmer error was providing a value of the wrong type, there are no numeric operations per se, and type errors such as this are what type-safe languages should protect against – and that was the perfectly valid expectation of the programmer here.

The fault is with the language, which inserted a hidden explicit conversion which results in an overflow when large values fail to be in range of a smaller type, rather than reporting a simple compile-time type error.

Can users protect against or mitigate the bug?

Once diagnosed in code, itself potentially non-trivial as the insertion is hidden, the bug may not be easy/convenient to mitigate. It might be better to preemptively protect by simply avoiding using explicit user-defined conversions…

Using an implicit conversion instead requires triggering one without being explicit, e.g. for the reported example:

   ...
   public static implicit operator UnixTime(int i) { ... }
   ...
   public UnixTime M(long i) { UnixTime result = i; return result; }
   ...

This produces the expected type error on the assignment result = i and comes with the possible advantage (common to all implicit user-defined conversions) of the language adding safe pre/post standard conversions if needed. However this hardly compares for clarity and convenience with using a cast expression.

Using a method instead is another option:

   ...
   public static UnixTime FromInt(int i) { ... }
   ...
   public UnixTime M(long i) { return UnixTime.FromInt(i); }
   ...

This produces the expected type error on UnixTime.FromInt(i) and the language can still insert safe implicit conversions if needed. It is much closer to the convenience and clarity of the cast expression.


Possible Solutions

The following are suggestions, you dear reader may have others.

A: Fix It – It is a bug

The obvious solution first, simply fix it.

Instead of inserting hidden explicit standard conversions as pre/post wrapping for explicit user-defined conversions follow the implicit case and insert implicit standard conversions.

Doing this would not reduce language functionality in any way – explicit pre/post conversions could be inserted by the user where needed.

  • For the Roslyn/VS teams the impact might be mainly around fix-its – rather than just issue type errors, where previously a hidden unsafe conversion would have been inserted, fix-its offering to insert an explicit conversion, or a checked explicit conversion, might be offered. Whole project updates could also be offered as some IDEs do in similar situations.

  • For the Standard team the work would primarily be a cut’n’paste of text from the implicit user-defined section to the explicit user-defined section.

  • For the Language team, apart from agreeing the semantics, the job of suitably publicising the change is probably in their court.

  • For Users with existing code the impact would be:

    • If they relied on the bad math then inserting an explicit cast to make it obvious; or
    • If it reveals a hidden UXB then fixing the type mismatch in their code

B: Document It – It is a feature

Note: This option is taken to include other reasons not to fix it, e.g. “[nobody has died/lost their fortune/etc. yet] it is not serious enough to fix” (apologies we’re stuck for a good sounding reason).

This is the approach if the bad math is deemed a “feature” of C# rather than a bug. If that is the decision then it still needs to be documented – both for those who wish to exploit it and those [smart folk] who would rather avoid it!

  • The Roslyn/VS teams could consider: fix-its to make things explicit; or issues warnings that the code might produce mathematically invalid results; but it may not be required to do anything.

  • The Standard team would insert an informative Note altering readers to the potential for a UXB – wording might be a challenge of course but presumably the “it's a feature” supporters could provide the selling points…

  • The Language team would publicise the issue, not everybody reads the Standard!

  • Users if they wish can just ignore the warnings, or they can address the type error to remove the UXBs.


Note: It has been suggested that A and/or B might be a use case for warning waves. It is not clear to us what the end scenario for this would be.


C: Do Nothing – Not Viable

All teams (Roslyn/VS, Language, Standard) know the bug is there.

Because the conversion is invisible it breaks the C# design – an “implicit“ conversion is never dangerous; breaks the type system – invalid conversion is passed; breaks math; and breaks the Principle of minmum surprise.

The argument has been floated that as it hasn’t knowingly caused harm yet nothing needs to be done. This is ambulance at the bottom of the cliff thinking, which is both bad, and ignores it could cause harm tomorrow… Further the referenced thread was raised by users after being bitten, only their own checking found the bug, so harm at some level has already been caused.

If you’re still not convinced this option is not viable we have two words for you Ford Pinto.

So the only viable question is how to address it, not whether to ignore it. You may come up with other good options other than A & B; but C just doesn’t cut it.

Credits

The origin of this particular report of the bug was a StackOverflow question (see here) posted by @RioMcBoo, which led to @myblindy raising this issue on csharplang and posting the code the sample above is based on. This was subsequently moved to this disscussion also on csharplang. More recently @jnm2 did an analysis of the Standard which located the source of the bug there.

Note: The above are credits for the precursor work this submission is based on, if you have any brickbats aim them at the author and not at these people.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions