sparse UTF8 byte arrays

    My application is handling a large number of strings.
    To reduce memory consumption I thought it would be a good idea to represent the strings as UTF8 encoded byte arrays:
    var sparseByteArray = Encoding.UTF8.GetBytes("some string");

    To my surprise, running Chrome on 64-bit Windows 10, this increased the memory use by a factor of four.
    The reason is that each element in the array ends up as a 64-bit number.
    This happens even though the useTypedArrays option is set to true.

    So in order to get a properly packed array I have to repack it:
    var byteArray = new byte[sparseByteArray.Length]; // this becomes a Uint8 array
    Array.Copy(sparseByteArray, byteArray, sparseByteArray.Length);
    Now byteArray is packed properly and is 8 times smaller than sparseByteArray.
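
At the JavaScript level, the repacking step amounts to copying a plain array into a typed one. The following is a minimal sketch in plain JavaScript, not Bridge's generated code; the names `sparse` and `packed` are made up for illustration.

```javascript
// A plain JavaScript array of byte values ("some" in UTF-8).
// Each element is an ordinary JS number, not a packed byte.
const sparse = [115, 111, 109, 101];

// Copy into a typed array: one byte per element, backed by an ArrayBuffer.
const packed = Uint8Array.from(sparse);

console.log(Array.isArray(sparse));          // true
console.log(packed instanceof Uint8Array);   // true
console.log(packed.byteLength);              // 4
console.log(Uint8Array.BYTES_PER_ELEMENT);   // 1
```

How many bytes the engine actually spends per element of the plain array is an implementation detail of the JavaScript engine; only the typed array guarantees one byte per element.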

    Of course, the price to pay is CPU cycles, both for copying and for garbage collection.
    It would be nice if the encoder could support UInt8 arrays directly.
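
For comparison, the standard `TextEncoder` Web API encodes straight into a `Uint8Array`, with no intermediate plain array. This is only a sketch of what the browser offers natively; it is not Bridge's `Encoding.UTF8` implementation, and whether it can be reached conveniently from Bridge code is an assumption not checked here.

```javascript
// TextEncoder is a standard Web API (also a global in current Node.js).
// It always encodes to UTF-8 and returns a Uint8Array.
const encoder = new TextEncoder();
const utf8 = encoder.encode("some string");

console.log(utf8 instanceof Uint8Array); // true
console.log(utf8.length);                // 11 (all characters are ASCII)

// Decoding the bytes gives the original string back.
const roundTrip = new TextDecoder().decode(utf8);
console.log(roundTrip === "some string"); // true
```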

    In general I find the Uint8 array to be fragile.
    As I pointed out in another thread, "Uint8 as generic type", all it takes to make it sparse is to pass it to a generic type of byte[].
    Initializing the array with values also makes it sparse:
    byteArray = new byte[] { 1, 2, 3 }; // is now represented as 64-bit numbers

    For all I know, this behaviour is what experienced Bridge users expect.
    But for me it is not, and I hope at least some members will appreciate this post.

    Regards Jens

    Hello JensTangen!

    Could you translate all that into a Deck? As pointed out in the thread you mentioned, the initial implementation of typed array support still has missing features, and we could log this scenario as an issue on GitHub so that it can eventually be addressed.

    I suspect Encoding.UTF8.GetBytes() simply returns an ordinary JS array, since the return value of its client-side implementation is probably not affected by the useTypedArrays setting.
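
One way to test that guess in the browser console is to check what kind of object a call returns. The helper below is a hypothetical sketch in plain JavaScript (the name `describeByteStorage` is made up here); it only distinguishes a `Uint8Array` from an ordinary array.

```javascript
// Hypothetical helper: report how a byte sequence is stored.
function describeByteStorage(value) {
  if (value instanceof Uint8Array) {
    return "Uint8Array (" + value.byteLength + " bytes)";
  }
  if (Array.isArray(value)) {
    return "plain array (" + value.length + " elements)";
  }
  return "unknown";
}

console.log(describeByteStorage(new Uint8Array(8)));        // Uint8Array (8 bytes)
console.log(describeByteStorage([0, 1, 2, 3, 4, 5, 6, 7])); // plain array (8 elements)
```

Passing the result of the generated `GetBytes` call to such a helper would show whether it is a typed array or not.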


      Hi Fabricio
      Deck does not give me the possibility to enable useTypedArrays. However, here is an example that illustrates the sparse array that results from encoding to UTF-8 or simply from initializing the array, and beneath it you will find the Bridge output.

      using System;
      using System.Text;

      public class Program
      {
          // Assumption: useTypedArrays is set to true.
          // byteArray becomes a proper typed Uint8 array.
          private static byte[] byteArray = new byte[8];
          // The initializer makes the array sparse, consisting of 64-bit numbers.
          private static byte[] sparseArray = new byte[] { 0, 1, 2, 3, 4, 5, 6, 7 };

          private static void Main()
          {
              // Encoding gives a sparse array of 64-bit numbers:
              var byteArrayUT8Sparse = Encoding.UTF8.GetBytes("some string");
              // Now copy it to a Uint8 array.
              var byteArrayUTF8 = new byte[byteArrayUT8Sparse.Length];
              Array.Copy(byteArrayUT8Sparse, byteArrayUTF8, byteArrayUTF8.Length);
              // Now, finally, byteArrayUTF8 is a proper UTF-8 encoded Uint8 array.
          }
      }

      Bridge output:
      Bridge.assembly("Demo", function ($asm, globals) {
          "use strict";

          Bridge.define("Demo.Program", {
              main: function Main () {
                  var byteArrayUT8Sparse = System.Text.Encoding.UTF8.GetBytes$2("some string");
                  var byteArrayUTF8 = System.Array.init(new Uint8Array(byteArrayUT8Sparse.length), System.Byte);
                  System.Array.copy(byteArrayUT8Sparse, 0, byteArrayUTF8, 0, byteArrayUTF8.length);
              },
              statics: {
                  fields: {
                      byteArray: null,
                      sparseArray: null
                  },
                  ctors: {
                      init: function () {
                          this.byteArray = System.Array.init(new Uint8Array(8), System.Byte);
                          this.sparseArray = System.Array.init([0, 1, 2, 3, 4, 5, 6, 7], System.Byte);
                      }
                  }
              }
          });
      });

      Regards Jens