Home > database >  Why does a List use almost 3x the memory of an array?
Why does a List use almost 3x the memory of an array?

Time:11-26

I'm trying to add a lot of data to a List, but it seems to use a lot more RAM than an array does. I was wondering why that is and if there's a better solution.

This solution with an array takes about 78 MB of RAM. Makes sense since 4 byte * 20000000 ~= 76 MB:

float[] arrayValues = new float[20000000];

for (int i = 0; i < 20000000; i  )
     arrayValues[i] = i;

But this solution with a list takes 206 MB (!!):

List<float> listValues = new();

for (int i = 0; i < 20000000; i  )
     listValues.Add(i);

How can that be? It's basically doing the same thing - saving 20000000 float values. Where is that additional 128 MB coming from? And is there a better way that doesn't produce so much overhead?

CodePudding user response:

When you Add new items into a List<T> it has to do memory reallocation to have enough space for these new items. Let's have a look at the process:

  List<float> listValues = new();

  int capacity = listValues.Capacity;

  for (int i = 0; i < 20000000; i  ) {
    listValues.Add(i);

    if (capacity != listValues.Capacity) {
      capacity = listValues.Capacity;

      Console.WriteLine(capacity);
    }
  }

Outcome:

4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432 // <- Finally, list allocates memory for 33554432 items

As you can see, now 33554432 items are allocated and 4 8 16 ... 16777216 are garbage. In the worst case we have 33554432 allocated items and 33554432 garbage items; in total 33554432 33554432 = 67108864 ~ 3 * 20000000 and you can see this 3 factor.

What can you do?

Specify the Capacity in order to avoid realloaction (typical solution):

  // We can avoid all this mess with reallocations
  // by specifing required capacity: 20000000 items in our case 
  List<float> listValues = new(20000000);

  for (int i = 0; i < 20000000; i  ) {
    listValues.Add(i);
  }

Collect all garbage before measuring:

  // Business as usual
  List<float> listValues = new();

  for (int i = 0; i < 20000000; i  ) {
    listValues.Add(i);
  }

  // Collect garbage to measure real List efficency:
  // List allocates 33554432 items vs. 20000000 in case of array 
  // About ~70% overhead 
  GC.Collect(2);
  • Related