From slow to SIMD: A Go optimization story

>>rbanff+(OP)
> why is it significant that we slice like a[i:i+4:i+4] rather than just a[i:i+4]?

Well I had never seen that "full slice" expression syntax before. It turns out that it's important because it controls the capacity of the new slice. In a[low:high:max] the capacity of the new slice is max - low, so here it is (i+4) - i = 4.

So by using the full slice expression you get a slice of length 4 and capacity 4. Without doing this the capacity would be equal to the capacity of the original slice.

I suppose that controlling the capacity is what lets the compiler eliminate the bounds check.

>>Thalia+qW
In my testing [1], that doesn't eliminate bounds checks. Instead, if my reading is correct, it avoids computing the otherwise unused `cap(a[i:i+4]) = cap(a) - i` value.

[1] https://go.godbolt.org/z/63n6hTGGq (original) vs. https://go.godbolt.org/z/YYPrzjxP5 (capacity not limited)

> Well I had never seen that "full slice" expression syntax before.

Go's notion of capacity is pragmatic but also confusing. I learned the hard way that `append` always reuses excess capacity when there is enough of it, as an optimization:

    a := []int{1, 2, 3, 4, 5}
    lo, hi := a[:2], a[2:]
    lo = append(lo, 6, 7, 8)      // Oops, it tries to reuse `lo[2:5]`!
    fmt.Printf("%v %v\n", lo, hi) // Prints `[1 2 6 7 8] [6 7 8]`

While I do understand the rationale, it is too unintuitive because there is no indication of the excess capacity in this code. I would prefer `a[x:y]` to be a shorthand for `a[x:y:y]` instead. The `a[x:y:len(a)]` case is of course useful though, so maybe a different shorthand like `a[x:y:$]` can be added.

>>lifthr+Xu1
Wow that seems pretty unsafe...

In D, for example, this works as most people would expect:

    import std.stdio;
    void main() {
      int[] a = [1, 2, 3, 4, 5];
      auto lo = a[0 .. 2], hi = a[2 .. $];
      lo ~= [6,7,8]; // new allocation
      writefln("%s %s", lo, hi);  // prints [1, 2, 6, 7, 8] [3, 4, 5]
    }

Appending to a slice simply can't overwrite elements of the backing array that another slice refers to (unless you do unsafe stuff very explicitly).
