Array vs Record, Dynamic Subscripts vs Static Field


So, I've been studying programming languages using Robert W. Sebesta's book, "Concepts of Programming Languages". There is one interesting paragraph that compares heterogeneous arrays to records. Apparently, array access is much slower because subscripts are dynamic, in contrast to static field names.

Question: So why is record access faster? Does it have something to do with how stack and heap memory are used?

page 265, chapter 6:

Records are used when the collection of data values is heterogeneous and the different fields are not processed in the same way. Also, the fields of a record often need not be processed in a particular order. Field names are like literal, or constant, subscripts. Because they are static, they provide very efficient access to the fields. Dynamic subscripts could be used to access record fields, but it would disallow type checking and would also be slower.

CodePudding user response:

First I'd like to mention that the concept of "performance" doesn't really exist detached from the actual hardware and low-level implementation. When the book says that a certain way of organizing data is "faster", it makes a number of assumptions about the actual compiler/interpreter implementation and the hardware (von Neumann architecture, CPU instruction set, memory layout).

With that out of the way, let's talk about the possible differences between the implementations of a static record, a homogeneous array, and a heterogeneous array.


1. Static record

struct Point {
    int x;
    int y;
};

This is a simple C struct. When you do:

Point *p = new Point();

the memory allocator allocates 8 bytes on the heap and stores the address of the start of this block in p.

The compiler knows the size of Point in advance (8 bytes), and for each field of Point it knows the field's size and offset (e.g. y has size 4 and offset 4).

When you access a field of the Point (p->x), the compiler replaces it with a calculation of the actual memory address of the field x (address of p + offset of x) and an access to this address. No runtime overhead whatsoever.
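
To make the "constant offset" idea concrete, here is a rough sketch (an illustration only, not what the compiler literally emits) that computes the address of y by hand using offsetof:

#include <cstddef>   // offsetof
#include <iostream>

struct Point {
    int x;
    int y;
};

int main() {
    Point *p = new Point{3, 4};

    // The offset of y within Point is a compile-time constant.
    const std::size_t y_offset = offsetof(Point, y);   // typically 4

    // p->y is equivalent to reading an int at (address of p) + y_offset.
    int *y_addr = reinterpret_cast<int*>(
        reinterpret_cast<char*>(p) + y_offset);

    std::cout << *y_addr << "\n";   // prints 4, same as p->y
    delete p;
}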


2. Homogeneous array

int *arr = new int[10];

This is a simple homogeneous array in C++. Similar to the struct, the allocator allocates 4 * 10 bytes on the heap and stores its start address in arr.

When you access an element of the array, arr[i], the compiler knows the address of arr and the size of its elements (they are all the same, since the array is homogeneous), so it can calculate the memory address you're accessing as arr + i * element_size (a single CPU instruction on most architectures).

Again, not much overhead, if you're not counting the access to i in addition to arr.
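
As a rough illustration (a sketch, not code you'd write for real), arr[i] is just this address arithmetic made explicit:

#include <iostream>

int main() {
    int *arr = new int[10];
    int i = 3;
    arr[i] = 42;

    // arr[i] is equivalent to *(arr + i): the compiler scales i by
    // sizeof(int) and adds it to the start address of the array.
    int *addr = arr + i;
    std::cout << *addr << "\n";   // prints 42, same as arr[i]

    delete[] arr;
}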


3. Heterogeneous array

Now, this is where things get interesting. There is no direct low-level representation for a heterogeneous array. There are multiple possible implementations that map this high-level concept onto actual memory and machine code.

Let's consider one of them.

enum ElType {
    INT, POINT
};

struct ArrEl {
    ElType type;
    char* elPtr;
};

ArrEl *arr = new ArrEl[10];

arr[1].type = POINT;
arr[1].elPtr = (char*)(new Point {3,4} );

arr[2].type = INT;
arr[2].elPtr = (char*)(new int {2} );

Notice that, since the array can now store elements of any type (for this demo, only int or Point):

  1. Array elements can have different sizes (Point is 8 bytes, int is 4), so the actual elements have to be allocated on the heap, and the array stores only pointers. (An alternative approach would allocate, for each element, memory equal to the size of the largest possible element, which is too wasteful in the general case; note: see comments below.)
  2. To know the actual type of the stored element, additional metadata has to be stored. This approach chooses to store it in the array, but most actual implementations (including Python and Java) store it together with the actual object, as an "object header"; a simplified sketch of that alternative follows this list.
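
Here is a minimal sketch of that "object header" alternative, reusing Point and ElType from above (ObjHeader, BoxedInt, and BoxedPoint are hypothetical names for this demo only):

struct ObjHeader {
    ElType type;     // metadata lives next to the value, not in the array
};

struct BoxedInt {
    ObjHeader header;
    int value;
};

struct BoxedPoint {
    ObjHeader header;
    Point point;
};

ObjHeader **arr2 = new ObjHeader*[10];
arr2[0] = reinterpret_cast<ObjHeader*>(new BoxedInt{ {INT}, 2 });
arr2[1] = reinterpret_cast<ObjHeader*>(new BoxedPoint{ {POINT}, {3, 4} });

// Reading an element: check the header first, then cast back to the full object.
if (arr2[0]->type == INT) {
    std::cout << reinterpret_cast<BoxedInt*>(arr2[0])->value;
}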

Now, to access an element of such an array, this metadata has to be checked:

ArrEl el = arr[2];

if (el.type == INT) {
  int *value = (int *) el.elPtr;
  std::cout <<  *value;
} else if (el.type == POINT) {
  Point *p = (Point *) el.elPtr;
  std::cout <<  p->x;
}

The overhead of accessing a heterogeneous array consists of the additional indirection: first you access the element of the array, then follow the pointer to the actual value on the heap, and check the metadata.

There is also additional memory overhead from storing all the pointers and metadata. This is especially noticeable when you compare homogeneous arrays of "simple" types, such as int, to heterogeneous arrays that can store, say, int and float (note: see comments below).
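
For a rough sense of the scale (exact numbers depend on the platform, padding, and allocator), compare the two layouts from above:

// 10 plain ints: 10 * sizeof(int) = 40 bytes in one contiguous block.
std::cout << 10 * sizeof(int) << "\n";

// 10 ArrEl entries: 10 * sizeof(ArrEl), typically 160 bytes on a 64-bit
// platform (a 4-byte enum padded up next to an 8-byte pointer), plus a
// separate heap allocation (and allocator bookkeeping) for every stored value.
std::cout << 10 * sizeof(ArrEl) << "\n";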


Ok, now that I've given the general idea of why heterogeneous arrays can be slow in theory, let's talk about the real world.

Most modern VMs for languages that have heterogeneous arrays have a JIT. Unlike static compilers (like the C++ compiler), a JIT can make optimistic assumptions about the executed code and, if such assumptions fail, recompile the code to a more pessimistic variant at runtime.

Back to arrays: although arrays in dynamic languages such as JavaScript and Python are heterogeneous, when an array is used in a homogeneous way, the JIT may compile it to be homogeneous internally! For example, V8 certainly does that. I don't think Python does that at the moment, but it might in the future.

Also, modern optimizing compilers, including static compilers, can rewrite your code in ways you wouldn't expect. For example, your code creates an object and performs some operations on its fields, but in reality the compiler may replace the field accesses with CPU registers, so no "actual" object is created at all.
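
A small sketch of the kind of code where this can happen (whether the optimization actually fires depends on the compiler, flags, and context):

// With optimizations on, a compiler can often prove that this Point never
// escapes the function and keep x and y entirely in registers: the field
// accesses become plain register arithmetic, and no Point is laid out in memory.
int manhattan(int a, int b) {
    Point p{a, b};
    return p.x + p.y;
}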

That is why it's always important to verify theoretical assumptions about performance with actual benchmarks.
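
A crude benchmarking sketch (hypothetical; the numbers, and even the ranking, depend on compiler flags, hardware, and cache behavior) comparing direct storage with one extra level of indirection, roughly like the heterogeneous array above:

#include <chrono>
#include <iostream>
#include <vector>

int main() {
    const int N = 10000000;
    std::vector<int> direct(N, 1);        // values stored inline
    std::vector<int*> indirect(N);        // values behind a pointer
    for (int i = 0; i < N; ++i) indirect[i] = new int(1);

    auto t0 = std::chrono::steady_clock::now();
    long long s1 = 0;
    for (int i = 0; i < N; ++i) s1 += direct[i];

    auto t1 = std::chrono::steady_clock::now();
    long long s2 = 0;
    for (int i = 0; i < N; ++i) s2 += *indirect[i];
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::milliseconds;
    std::cout << "direct:   " << std::chrono::duration_cast<ms>(t1 - t0).count()
              << " ms (sum " << s1 << ")\n";
    std::cout << "indirect: " << std::chrono::duration_cast<ms>(t2 - t1).count()
              << " ms (sum " << s2 << ")\n";
}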


P.S. all C++ code in this demo is not idiomatic, unsafe and bad. Don't use it at home.
