16-bit per channel normals
Reduce normal accuracy to 16 bits per channel, using either a half-precision float or perhaps a short. This will halve the currently expensive reading of normals from memory (32 bytes per read), something performed very frequently by the MLS smoothing.
This may have a significant negative impact on visual quality, although it seems unlikely to be that bad, so it should be considered an experiment.
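As a rough illustration of the "short" option, the sketch below stores each channel as a signed 16-bit integer mapped from [-1, 1] and renormalises on read. The type and function names (`Normal32`, `Normal16`, `packNormal`, `unpackNormal`) and the four-channel layout are assumptions made for the example, not taken from the existing code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical layouts for illustration only.
struct Normal32 { float x, y, z, w; };         // 4 x 32-bit channels
struct Normal16 { std::int16_t x, y, z, w; };  // 4 x 16-bit channels

static_assert(sizeof(Normal16) * 2 == sizeof(Normal32),
              "16-bit channels take half the memory of 32-bit channels");

// Map a value in [-1, 1] to a signed 16-bit integer; out-of-range input is clamped.
inline std::int16_t packSnorm16(float v) {
    const float c = std::max(-1.0f, std::min(1.0f, v));
    return static_cast<std::int16_t>(std::lround(c * 32767.0f));
}

// Inverse mapping; -32768 is clamped so the result stays within [-1, 1].
inline float unpackSnorm16(std::int16_t s) {
    return std::max(static_cast<float>(s) / 32767.0f, -1.0f);
}

inline Normal16 packNormal(const Normal32 &n) {
    return Normal16{packSnorm16(n.x), packSnorm16(n.y),
                    packSnorm16(n.z), packSnorm16(n.w)};
}

// Decode and renormalise x, y, z so that quantisation error does not
// change the length of the normal. The w channel is passed through.
inline Normal32 unpackNormal(const Normal16 &p) {
    Normal32 n{unpackSnorm16(p.x), unpackSnorm16(p.y),
               unpackSnorm16(p.z), unpackSnorm16(p.w)};
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    if (len > 0.0f) {
        n.x /= len;
        n.y /= len;
        n.z /= len;
    }
    return n;
}
```

With this mapping the quantisation step is 1/32767 (about 3e-5) per channel, which is one way to judge whether the loss in visual quality is likely to be noticeable. A half-precision float variant would follow the same pack/unpack pattern with a different conversion.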
Edited by Nicolas Pope