If I know how many of a collection to create, should I create them all or create them on the fly by-CodePudding

I need to create an array of several hundred objects. MyObject has a large memory footprint, takes as bit of effort/time to create and will be created one at a time. I know how many when I start. I can create an array MyObject[n] sized appropriately then insert the MyObject via an index. Or I can create an ArrayList<MyObject> then do an add(MyObject). The code size and structure is similar but I can imagine the second method 'could' fragment the memory differently than the first method. (Really?) I suppose the difference is pretty small, especially considering there are less than 1000 entries, but my guess is that the first method is better as it uses information I have earlier in the process. Does the ArrayList.add() have to do extra work to increase its size?

CodePudding user response：

If you check the implementation of add(E e) inside the class ArrayList, you can see this:

public boolean add(E e) {
    modCount  ;
    add(e, elementData, size); //check below
    return true;
}

private void add(E e, Object[] elementData, int s) {
    if (s == elementData.length)
        elementData = grow();
    elementData[s] = e;
    size = s   1;
}

So basically, the only time when ArrayList.add() does something more than simply setting an element inside the E[] array that it holds, is when s == elementData.length.

If you check what these fields are:

elementData: "The array buffer into which the elements of the ArrayList are stored. The capacity of the ArrayList is the length of this array buffer"
size: "The size of the ArrayList (the number of elements it contains)."

So, if you check the constructor public ArrayList(int initialCapacity), you can see that when you provide an initial capacity, the array elementData will be initialized with the capacity you provide:

if (initialCapacity > 0) {
    this.elementData = new Object[initialCapacity];

Which, in other words, means that the if (s == elementData.length) condition will never be met if you know in advance the number of elements that you will store in the array, and so the elementData = grow() will never be called making the usage of .add() method basically equivalent to you manually setting each element inside the array by index.

To sum up, if you know that you will store 850 elements in your list, then doing this:

List<Element> elements = new ArrayList<>(850);
elements.add(element1);
elements.add(element2);
//...
elements.add(element850);

... is practically equivalent, operations-wise, than doing the set into an array yourself (there is one if-check and one increment more but that's really irrelevant). With the advantage though that you have a structure such as List that makes the object much easier to use (you'll be able to use iterators and all other features provided by List interface).

So especially when the number of elements is so small (< 1000) and that you know how many they are in advance, I wouldn't overthink it and go directly into an ArrayList.

CodePudding user response：

elementData = grow() is super crucial. ArrayList uses the double-and-copy strategy to amortize the cost of insertion. When it hit's the length of it's internal buffer it creates a new array of double the size of the last and copies in the old data. So, add(x) amortizes to O(1) (constant) because over the course of enough adds, it averages to a constant insertion time.

I mean, think about this logically:

Why on earth offer an initialSize in the constructor if it's of no consequence?
How can it continue to just accept data and behave like an array?

But, he is right: unless you're working with a huge list and perpetually updating it, its of no consequential effect