Exploring System.Collections.Immutable Part 11: Code Contracts

December 17, 2014 Leave a comment

This is part 11 of the System.Collections.Immutable section of my Exploring the .NET Core Series.

In 2008, Microsoft Research published Code Contracts, which provide a language-agnostic way to express coding assumptions in .NET programs. The assumptions take the form of pre-conditions, post-conditions, and object invariants.

Here is a simple example of code which uses Code Contracts:

using System.Diagnostics.Contracts;

public class StringUtils
{
    internal static string Append(string s1, string s2)
    {
        Contract.Requires(s1 != null);
        Contract.Requires(s2 != null);
        Contract.Ensures(Contract.Result<string>() != null);
        return s1 + s2;
    }
}

Code Contracts assertions are not limited to runtime enforcement. They may instead be enforced by compile-time static analysis. For example, it is very simple to annotate methods with Code Contracts, set up a continuous integration (CI) server to perform static analysis, and fail the build if there are any failed assertions. This gives us the best of both worlds: a guarantee our code enforces our assumptions with essentially zero runtime penalty.

By design, Code Contracts are not enforced unless the appropriate tools are configured to check them. For this reason they are usually not appropriate for parameter validation on public methods; there is still the need for traditional parameter validation. However you can combine traditional parameter validation on public methods with Code Contract-based assertions for internal methods as follows:

public class StringUtils
{
    // This is the method that external callers would use, as we can't
    // guarantee they will enforce Code Contract checking
    public static string Append(string s1, string s2)
    {
        if (s1 == null)
            throw new ArgumentNullException("s1");
        if (s2 == null)
            throw new ArgumentNullException("s2");
        // The Code Contracts static analyzer is quite clever and it
        // realizes that the AppendInternal pre-conditions are satisfied
        // due to the above two statements, so no Contract.Assert()s
        // are required.
        return AppendInternal(s1, s2);
    }

    // This is the method that other code in this assembly would use,
    // as we can guarantee that this code is checked at compile-time
    // with the Code Contracts static analyzer
    internal static string AppendInternal(string s1, string s2)
    {
        Contract.Requires(s1 != null);
        Contract.Requires(s2 != null);
        Contract.Ensures(Contract.Result<string>() != null);
        return s1 + s2;
    }
}

You can write Code Contracts assertions with just the .NET 4.5 SDK installed, but they will not be enforced. To enforce them at compile-time within Visual Studio:

  1. Install the Code Contracts for .NET extension using NuGet:
    Code Contracts for .NET
  2. Restart Visual Studio
  3. Open up the Project Properties dialog and click on the Code Contracts tab
  4. Click “Perform Static Contract Checking”
    Code Contracts Project Properties

System.Collections.Immutable uses Code Contracts only to express post-conditions and object invariants, never to validate pre-conditions. This is presumably to integrate well with client code which uses Code Contracts.

For more information on Microsoft Code Contracts, please read:

Recommendations

  1. Liberally annotate all methods with Code Contracts post-conditions and object invariants.
  2. If you can guarantee that a method will be called only by code that you control (e.g. internal methods), use Code Contracts to enforce pre-conditions. If you cannot, use traditional parameter validation.
  3. Integrate Code Contracts static code analysis into your CI pipeline, and fail the build on any warnings.

Exploring System.Collections.Immutable Part 10: Performance Tuning Enumeration

December 2, 2014 Leave a comment

This is part 10 of the System.Collections.Immutable section of my Exploring the .NET Core Series.

The .NET Core’s ImmutableArray<T> provides two enumerators. The first has been highly tuned for speed, and the second is a fallback for compatibility when it is required.

The high-performance enumerator uses the following performance optimizations:

  1. The enumerator is a struct, rather than a class, so that it is stack-allocated rather than heap-allocated.
  2. The enumerator does not implement IEnumerator or IEnumerator<T>, as this would require it to implement IDisposable. By not implementing IDisposable the iterator will inline during foreach loops..
  3. The enumerator does not use range checks in Enumerator.Current; it requires on .NET’s array range checks to throw an exception instead.

The high-performance enumerator is called ImmutableArray<T>.Enumerator, which is returned by ImmutableArray<T>.GetEnumerator():

[Pure]
public Enumerator GetEnumerator()
{
    this.ThrowNullRefIfNotInitialized();
    return new Enumerator(this.array);
}

Note that this method is not defined by any interface ImmutableArray<T> implements, but it will be used by foreach. This is because foreach is pattern-based rather than interface-based. What this means is that foreach does not require an object to implement IEnumerable as long as it implements a method named GetEnumerator(). Similarly, the returned object is not required to implement IEnumerator as long as it implements a Current property and a MoveNext() method.

The result of all this work is that foreach over an ImmutableArray is just as efficient as a hand-written for loop.

Recommendations

  1. To improve the performance of foreach, consider writing a specialized, struct-based enumerator in addition to the traditional one.

Exploring System.Collections.Immutable Part 9: Immutable Collections and the Builder Pattern

December 1, 2014 Leave a comment

This is part 9 of the System.Collections.Immutable section of my Exploring the .NET Core Series.

Using the builder pattern to allow for easier construction of immutable objects is well-known.

The .NET Core’s immutable collections assembly, System.Collections.Immutable, also uses the builder pattern, but for a slightly different reason: to improve the performance of making many changes to the collection. This is possible because, unlike the immutable collection itself, the builder pattern does not need to maintain the immutable collection’s invariants after each modification. The builder pattern merely needs to reestablish the invariants of the immutable collection upon the publishing of the results.

Consider this code:

ImmutableArray<int> array = new ImmutableArray<int>();
for (int i = 1; i <= 100; ++i)
{
    array = array.Add(i);
}
// array now contains [1..100]

On each iteration of the loop, array must be a valid ImmutableArray<int>. This will lead to a large number of reallocations and memory copies in order to maintain this requirement.

However, if the above code were replaced with this, it would be far more efficient:

ImmutableArray<int>.Builder builder = new ImmutableArray<int>().ToBuilder();
for (int i = 1; i <= 100; ++i)
{
    builder.Add(1);
}
ImmutableArray<int> array = builder.ToImmutable();
// array now contains [1..100]

This is also the pattern used by String and StringBuilder.

The .NET Core also uses a nice trick of separating ImmutableArray<int>.Builder, an inner class with a substantial amount of code, into its own file by using partial classes:

// ImmutableArray`1.cs
/// <summary>
/// A readonly array with O(1) indexable lookup time.
/// </summary>
/// <typeparam name="T">The type of element stored by the array.</typeparam>
[DebuggerDisplay("{DebuggerDisplay,nq}")]
public partial struct ImmutableArray<T> : IReadOnlyList<T>, IList<T>, IEquatable<ImmutableArray<T>>, IImmutableList<T>, IList, IImmutableArray, IStructuralComparable, IStructuralEquatable
{
    ....
}

// ImmutableArray`1+Builder.cs
public partial struct ImmutableArray<T>
{
    /// <summary>
    /// A writable array accessor that can be converted into an <see cref="ImmutableArray{T}"/>
    /// instance without allocating memory.
    /// </summary>
    [DebuggerDisplay("Count = {Count}")]
    [DebuggerTypeProxy(typeof(ImmutableArrayBuilderDebuggerProxy<>))]
    public sealed class Builder : IList<T>, IReadOnlyList<T>
    {
        ...
    }
}

Recommendations

  1. Immutable objects are often highly valuable. Consider making your objects immutable and implementing the builder pattern.

Exploring System.Collections.Immutable Part 8: NullReferenceException Performance Tricks

November 28, 2014 Leave a comment

This is part 8 of the System.Collections.Immutable section of my Exploring the .NET Core Series.

The .NET Core’s System.Collections.Immutable.ImmutableArray class implements an immutable wrapper around a normal C# managed array. This looks something like:

public struct ImmutableArray<T>
{
    internal T[] array;
    ...
}

ImmutableArray.array is lazy-initialized.

Within the ImmutableArray class, there are a number of methods which have the precondition that ImmutableArray.array must be initialized. These preconditions must be checked before the method begins processing to make sure we handle invalid states correctly.

One example is ImmutableArray.IndexOf, which could have been implemented as:

public struct ImmutableArray<T>
{
    internal T[] array;

    public int IndexOf(T item)
    {
        if (this.array == null)
            throw new NullReferenceException(); // Or maybe InvalidOperationException?
        ...
    }
}

However, the .NET Core team was more clever than that. They realized that they could avoid a (likely easily-predicted) branch and instead implemented it as follows:

public struct ImmutableArray<T>
{
    internal T[] array;

    public int IndexOf(T item)
    {
        this.ThrowNullRefIfNotInitialized();
        ...
    }

    /// <summary>
    /// Throws a null reference exception if the array field is null.
    /// </summary>
    internal void ThrowNullRefIfNotInitialized()
    {
        // Force NullReferenceException if array is null by touching its Length.
        // This way of checking has a nice property of requiring very little code
        // and not having any conditions/branches. 
        // In a faulting scenario we are relying on hardware to generate the fault.
        // And in the non-faulting scenario (most common) the check is virtually free since 
        // if we are going to do anything with the array, we will need Length anyways
        // so touching it, and potentially causing a cache miss, is not going to be an 
        // extra expense.
        var unused = this.array.Length;

        // This line is a workaround for a bug in C# compiler
        // The code in this line will not be emitted, but it will prevent incorrect 
        // optimizing away of "Length" call above in Release builds.
        // TODO: remove the workaround when building with Roslyn which does not have this bug.
        var unused2 = unused;
    }
}

Personally I’d be deathly afraid that a more-clever compiler would remove this code entirely.  I would not implement this pattern unless I had excellent automated unit test coverage for this scenario.

Recommendations

  1. While it may be possible to avoid if (x == null) checks with clever member dereferences, don’t do it; it’s too risky.

Exploring System.Collections.Immutable Part 7: Reference Versus Structural Equality

November 27, 2014 Leave a comment

This is part 7 of the System.Collections.Immutable section of my Exploring the .NET Core Series.

In the previous post, I referenced EqualityComparer<T>.Default. If T does not implement IEquatable, EqualityComparer<T>.Default will use the framework-defined Object.Equals(), which implements reference equality.

However, many times you want to compare two types for structural equality (i.e. identical content) rather than reference equality (i.e. two references point to the same instance of the class). The interface IStructuralEquatable was defined to allow a class to explicitly implement structural, rather than reference equality. Related classes include IStructuralComparable and StructuralComparisons.

IStructuralEquatable.Equals() also accepts a user-provided IEqualityComparer which will be used to compare the object’s member variables for equality.

Here’s some sample code which demonstrates its use:

// A comparer that considers double.NaN != double.NaN
public class NanComparer : IEqualityComparer
{
   public new bool Equals(object x, object y)
   {
      if (x is double)
         return (double) x == (double) y;
      else 
         return EqualityComparer<object>.Default.Equals(x, y);
   }

   public int GetHashCode(object obj)
   {
      return EqualityComparer<object>.Default.GetHashCode(obj);
   }
}

// C#'s Array implements IStructualEquatable but does not implement IEquatable
double[] array1 = { double.NaN, 1.0, 2.0 };
double[] array2 = { double.NaN, 1.0, 2.0 };

// Compare the arrays for equality using Object.Equals() (reference equality).
Console.WriteLine(array1.Equals(array2)); // outputs false

IStructuralEquatable equ = array1;

// Call IStructuralEquatable.Equals using default comparer.
// EqualityComparer<object>.Default.Equals considers double.NaN to
// be equal to itself.
Console.WriteLine(equ.Equals(array2,
    EqualityComparer<object>.Default)); // outputs true

// Call IStructuralEquatable.Equals using
// StructuralComparisons.StructuralEqualityComparer.  This falls back
// to EqualityComparer<object>.Default.Equals.
Console.WriteLine(equ.Equals(array2,
    StructuralComparisons.StructuralEqualityComparer)); // outputs true

// Call IStructuralEquatable.Equals using NanComparer.
Console.WriteLine(equ.Equals(array2,
    new NanComparer())); // outputs false because NaN != NaN

The .NET Core’s ImmutableArray class implements IStructuralEquatable:

namespace System.Collections.Immutable
{
    /// <summary>
    /// A readonly array with O(1) indexable lookup time.
    /// </summary>
    /// <typeparam name="T">The type of element stored by the array.</typeparam>
    [DebuggerDisplay("{DebuggerDisplay,nq}")]
    public partial struct ImmutableArray<T> : IReadOnlyList<T>, IList<T>,
        IEquatable<ImmutableArray<T>>, IImmutableList<T>, IList,
        IImmutableArray, IStructuralComparable, IStructuralEquatable
    {
        ...
    }
}

It is unclear to me why this is the only collection in System.Collections.Immutable to implement IStructuralEquatable.

Recommendations

  1. If a collection implements IStructuralEquatable, use IStructuralEquatable.Equals() to test for structural equality. Use StructuralComparisons.StructuralEqualityComparer for simple structural equality, or a custom IEqualityComparer otherwise.
  2. If a collection implements IStructuralComparable, use IStructuralComparable.CompareTo() to perform a structural comparison. Use StructuralComparisons.StructuralComparer for simple structural comparisons, or a custom IComparer otherwise.
  3. Consider implementing IStructuralComparable and IStructuralEquatable on custom collections.

Exploring System.Collections.Immutable Part 6: Use IEquatable for Higher-Performance Equals()

November 24, 2014 1 comment

This is part 6 of the System.Collections.Immutable section of my Exploring the .NET Core Series.

Let’s say you are writing a custom IList<T> which contains the following code:

public class MyList<T> : IList<T>
{
    private T[] array;

    ...

    public int IndexOf(T item)
    {
        for (int i = 0; i != array.Length; ++i) {
            if (this.array[i].Equals(item)) {
                return i;
            }
        }

        return -1;
    }
}

The above code uses T‘s implementation of Object.Equals(), which is defined as:

public virtual bool Equals(Object obj);

If T is a value type, it will be automatically boxed by the compiler, which has a slight performance cost. However, if you knew that T implemented IEquatable<T>, then you could avoid the boxing entirely. For example, this code would be slightly better performing than the above for value types:

public class MyList<T> : IList<T> where T : IEquatable<T>
{
    // Everything else is the same as above
}

However, it is inadvisable to require MyList to only contain objects which implement IEquatable<T>.

You can get the best of both worlds by using EqualityComparer<T>.Default to choose IEquatable<T>.Equals() when available, and Object.Equals() otherwise, as follows:

public class MyList<T> : IList<T>
{
    private T[] array;

    ...

    public int IndexOf(T item)
    {
        return IndexOf(item, EqualityComparer<T>.Default);
    }

    public int IndexOf(T item, IEqualityComparer<T> equalityComparer)
    {
        for (int i = 0; i != array.Length; ++i) {
            if (equalityComparer.Equals(this.array[i], item)) {
                return i;
            }
        }

        return -1;
    }
}

Recommendations

  1. Implement IEquatable<T> on value types for which you expect .Equals() to be called a lot, especially those you put into arrays, lists, or hash tables.

Exploring System.Collections.Immutable Part 5: Keep Indexers Trivial to Allow JIT Optimization

November 21, 2014 Leave a comment

This is part 5 of the System.Collections.Immutable section of my Exploring the .NET Core Series.

This is a simple recommendation based on observations from System.Collections.Immutable.

Recommendations

  1. Keep the implementation of an indexer as trivial as possible to allow the JIT optimization of removing array bounds checking to work. For example, don’t check if a member variable is null; just use it and allow the NullReferenceException to happen naturally.

    In other words, use:

    public T this[int index]
    {
        get
        {
            return this.array[index];
        }
    }
    

    not:

    public T this[int index]
    {
        get
        {
            if (this.array == null)
                throw new NullReferenceException();
            return this.array[index];
        }
    }
    
Follow

Get every new post delivered to your Inbox.

Join 38 other followers