Managing General Purpose Registers AVR

I've been learning AVR assembly on the ATtiny10 and have been struggling with how to think about memory management. When should I use General Purpose Registers (GPRs) vs putting things into SRAM?

Also, how should I use GPRs in code to avoid clashes? For example, if I'm using a GPR to do something and then an interrupt fires whose handler uses the same register, will that cause bugs? How should I avoid that? Should I reserve a few registers purely for interrupts, or is there another effective strategy?

CodePudding user response：

When should I use GPRs vs SRAM?

Imagine your desk, which you sit at when you are working on something. It may be an electronic circuit, a jigsaw puzzle or maybe a LEGO set that you're building. But you are working on something made out of parts and you have a series of operations that you need to perform to finish your work.

While organizing one's desk often comes down to personal preference, I feel one sensible way of doing it is to have all the parts and tools spread around your desk while leaving an empty area close to you for the actual operations. Grouping the tools and parts by function, size, similarity or some other property is a good idea, but it is important that they do not get in your way when they are not needed. So you keep the empty space close and put them further away on the desk. As long as you can reach and grab them once they are needed it is okay, even though fetching something like that requires some work and time - you need to lean forward, reach out with your hand, maybe even stand up a little.

Now if you have an operation to perform and you need two parts from your desk, you can just fetch them and put them in your empty working area. Since it is empty and close to you it is easy to connect the two LEGO blocks, solder a resistor to a PCB or connect two puzzle pieces together, whatever it is that you are doing. This organization of your desk allows you to be efficient and makes your life easier.

In this scenario, the working area is your ALU along with the registers, and the area where you put all your tools and components is the SRAM. You can easily reach and grab anything from SRAM at any time, and it is spacious compared to the registers so you can fit a lot more things there, but it would be either very hard or outright impossible to do any work directly there. Additionally, each fetch from memory does take some time.

Your registers on the other hand are very close to you and it is very easy to perform work in here, and the data is very close to the ALU (your hands which perform the work). At the same time it is also a much smaller area so you can't really put a lot of components in there without defeating the whole purpose of it.

The same workflow applies here. You have all the data that your program needs stored in SRAM. Whenever you need to perform some operations, you fetch only parts of the data that you need and perform the operations directly in the registers. Afterwards, the result becomes another piece of data which you can decide to store in one of the registers, or back into the memory, depending on your circumstances.

If you are running a small algorithm you may be fine with putting everything into the registers. Following the desk analogy: are you soldering a 10-element electronic circuit that blinks LEDs at an interval? You can probably put everything in your working area and still work freely. Are you soldering a PCB with a separate microcontroller and various components made out of even more components, with like 50 pieces? Yeah, you will not fit everything in your working area without impeding your work, so better use the shelves on your desk.

The same applies to the CPU: are you calculating the sum of an array that has 4 elements? You can put all the elements in the GPRs, no problem. What if your array has 10 elements? 30? 100? At some point you can't fit it all in the registers, so it is better to just store the array in memory and fetch only the parts of the array you need at the time. At the same time, the accumulator which stores the partial sum of all previously processed elements is always needed for the next operation, so you are better off keeping it in a register.
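To make the array example concrete, here is a minimal sketch in AVRASM2 syntax. It assumes the device include file is already included (so XL and XH are defined); the names array, ARRAY_LEN and sum_array are made up for illustration. Only r16 and above are used, so it should also suit the ATtiny10.

```asm
.equ ARRAY_LEN = 10            ; hypothetical array size

.dseg
array:  .byte ARRAY_LEN        ; the array itself lives in SRAM

.cseg
sum_array:
    ldi   XL, low(array)       ; X points at the first element
    ldi   XH, high(array)
    ldi   r17, ARRAY_LEN       ; loop counter stays in a register
    clr   r16                  ; accumulator stays in a register
sum_loop:
    ld    r18, X+              ; fetch one element from SRAM, advance X
    add   r16, r18             ; the actual work is register-to-register
    dec   r17
    brne  sum_loop
    ret                        ; 8-bit sum (wraps modulo 256) is in r16
```

The accumulator and the counter are needed on every iteration, so they never leave the registers; each array element is needed once, so it is fetched, used and forgotten.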

In the end the way you use your registers and SRAM is entirely up to you and the work that you are performing on the data. But your decision is not completely random, instead it is a trade-off. Some data is more important at a particular moment in time than other data and it is up to you to decide which to keep in a register, which to put further on your desk for later, and which to maybe even ditch completely. The registers can hold only a small amount of data, but that data is directly available to the instructions that do the actual work, so store in them what you need at that point in time. The memory can store everything else, but you need some time to fetch the data from memory into the registers before you can work on it.

Now this is all true on other architectures as well, not just AVR. On CISC architectures like x86, instructions which perform work, such as ADD, can accept memory operands. Does this mean that the addition is performed in memory? No, it only means that both the fetch and the addition are encoded in a single instruction. The addition still has to be performed in the ALU, so the fetch still has to happen - it's just not explicitly written by the programmer as a separate instruction in the assembly code. On the other hand, the AVR version of ADD (which you can find in this pdf, section 6) accepts only register operands. This is often the case with RISC architectures. You need to explicitly load your data into registers using a dedicated instruction (for example the LD instruction on AVR) before performing work on it (such as ADDition). On both CISC and RISC architectures the fetch still happens, with all the same drawbacks.
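A small side-by-side sketch of that difference. The x86 line appears only as a comment for comparison; the label value and the routine name add_value are hypothetical.

```asm
; x86, for comparison only (not AVR code):
;     add al, [value]      ; fetch and add encoded in one instruction

; AVR: the fetch is a separate, explicit instruction.
.dseg
value:  .byte 1            ; hypothetical one-byte variable in SRAM

.cseg
add_value:
    lds   r17, value       ; load the operand from SRAM into a register
    add   r16, r17         ; ADD takes register operands only
    ret                    ; result is in r16
```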

I know you probably expected a more specific answer, but the truth is that it is entirely up to you as a programmer how you implement the algorithm that you are working on. However, in your decisions you are guided by the constraints of your platform as well as by the algorithm itself. In the end you want to use the registers to the fullest and perform as much work as you can with the data you have already fetched, while minimizing the total number of loads from and stores to the SRAM. Hope that explanation and the desk analogy help you, and good luck learning.

How to avoid GPRs being trashed?

When a function is running and using some of the GPRs, an interrupt might happen, and the CPU jumps through the interrupt vector to the interrupt handler. Now, the handler might also need the same GPRs, so how do you avoid the previous values being "trashed"? If you have a working stack (you should! it's very helpful and does not cost anything if you are not using it) you can begin your interrupt handler by PUSHing all the registers that it plans on using onto the stack, and then POPping them back into the registers before the interrupt handler returns.
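A minimal sketch of such a handler, assuming AVRASM2 syntax and a device include file that defines SREG. The name timer_isr and the choice of r16 and r17 are only for illustration. It also saves SREG, because the AVR hardware does not preserve the status flags on interrupt entry.

```asm
timer_isr:                     ; hypothetical handler name
    push  r16                  ; save every register the handler touches
    in    r16, SREG            ; the flags are not saved by hardware
    push  r16
    push  r17

    ; ... handler work using r16 and r17 goes here ...

    pop   r17                  ; restore in reverse order
    pop   r16
    out   SREG, r16            ; flags back as the interrupted code left them
    pop   r16
    reti
```

The interrupted code sees r16, r17 and the flags exactly as it left them, at the cost of three bytes of stack plus the return address.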

A similar technique is often used on other platforms when a function is called - there are usually caller-saved registers and callee-saved registers (related SO answer here). The difference with interrupts is that there is no "caller", as the function is interrupted and has no control over the fact that the interrupt handler is being called, so all the registers are callee-saved - the callee being the interrupt handler. And the handiest way to save them is to push them onto the stack and later pop them from it.

Now the drawback here is that the interrupt handler should be fast and that the stack on which you save the registers has limited space. If interrupts arrive rapidly and each handler spends too much time pushing and popping values, the handlers may start to overlap in time and fill the stack too quickly. Because of this, you might consider designating only a part of the GPRs for use in interrupt handlers and saving just those, so that the handlers are not allowed AT ALL to modify other registers. Another smart move is to only save the registers that the interrupt handler actually modifies. If the interrupt handler only requires 2-3 GPRs then pushing and popping them should not take much space, nor time. The specifics are, again, up to you to decide and depend entirely on your use case. In the end though the stack is your friend :)
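One way to take the "designated registers" idea further, and the one the question itself suggests, is to reserve a couple of registers purely for interrupt handlers so nothing needs to be pushed at all. This is only a sketch under stated assumptions: the main program never touches r24 and r25, handlers do not nest, and the names isr_sreg, isr_tmp, ticks and tick_isr are all hypothetical.

```asm
; Assumption: r24 and r25 are used ONLY by interrupt handlers.
.def  isr_sreg = r24       ; hypothetical names
.def  isr_tmp  = r25

.dseg
ticks:  .byte 1            ; hypothetical counter in SRAM

.cseg
tick_isr:
    in    isr_sreg, SREG   ; keep the interrupted code's flags
    lds   isr_tmp, ticks
    inc   isr_tmp
    sts   ticks, isr_tmp
    out   SREG, isr_sreg   ; put the flags back
    reti
```

The trade-off is that those two registers are lost to the rest of the program, which matters on a part with few registers to begin with.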

Small EDIT

I assumed you will mostly be using code written by yourself in the beginning, which is perfectly fine. Take your time learning pure assembly without any libraries, this is where all the fun is really - seeing the CPU at a very low level. Before you try any libraries you can consider implementing the functionality present in the library by yourself. But at some point in the future you will probably want to use a library. If you use such a library you will have to conform to its ABI - Application Binary Interface. This is the specification of how functions inside the library are to be called, including which registers must be saved and restored by the interrupt routine.

One of the first libraries that you will come across will most likely be avr-libc, the libc equivalent for the AVR platform. Here you can find its ABI and calling conventions. Do not hurry though, and keep learning with assembly and your own code only for as long as you'd like. I just thought this is some info that you will most likely need in the future.

CodePudding user response：

I use GPRs in this manner

.def    al  = r16
.def    ah  = r17
.def    bl  = r18
.def    bh  = r19
.def    cl  = r20
.def    ch  = r21
.def    dl  = r22
.def    dh  = r23
.def    wl  = r24
.def    wh  = r25

.def    XH  = r27
.def    XL  = r26
.def    YH  = r29
.def    YL  = r28
.def    ZH  = r31
.def    ZL  = r30

This mapping is very close to the 8086 registers, which I used before AVR. I use WH:WL as a loop counter and XH:XL, YH:YL, ZH:ZL as pointers.

I use R0 and R1 as temp registers. The other registers, R2 up to R15, I use as atomic bit status variables or very quick global variables, depending on the application. All other app data is located in SRAM. (On the ATtiny10 itself only R16 to R31 exist, so this part applies to the larger AVRs.)

This register definition allows some useful macros for word manipulation. For example:

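; ldiw pair, constant : load a 16-bit constant into a register pair
.macro  ldiw
    ldi @0l, low(@1)
    ldi @0h, high(@1)
.endm

; stsw address, pair : store a register pair to SRAM, low byte first
.macro  stsw
    sts @0, @1l
    sts @0+1, @1h
.endm

; ldsw pair, address : load a register pair from SRAM, low byte first
.macro  ldsw
    lds @0l, @1
    lds @0h, @1+1
.endm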
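A hedged usage sketch, assuming the al/ah and bl/bh definitions and the three macros above are in scope. The variable counter is hypothetical. The first macro argument is the register pair's base name, which the macro extends with l and h.

```asm
.dseg
counter: .byte 2           ; hypothetical 16-bit variable in SRAM

.cseg
    ldiw  a, 1000          ; al = 0xE8, ah = 0x03
    stsw  counter, a       ; counter = al, counter+1 = ah
    ldsw  b, counter       ; bl = counter, bh = counter+1
```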

In interrupt handlers I preserve the registers I use on the stack. Don't forget to save SREG too; having flags changed between a compare and a branch would break the code that got interrupted.