Given the letter(s) of an Excel column header I need to output the column number.
It goes A-Z
, then AA-AZ
then BA-BZ
and so on.
I want to go through it like it's base 26, I just don't know how to implement that.
It works fine for simple ones like AA
because 26^0 = 1
26^1 = 26
= 27
.
But with something like ZA
, if I do 26 ^ 26
(z is the 26th letter) the output is obviously too large. What am I missing?
CodePudding user response:
If we decode "A" as 0, "B" as 1, ... then "Z" is 25 and "AA" is 26.
So it is not a pure 26-base encoding, as then a prefixed "A" would have no influence on the value, and "AAAB" would have to be the same as "B", just like in the decimal system 0001 is equal to 1. But this is not the case here.
The value of "AA" is 1*261 0, and "ZA" is 26*261 0.
We can generalise and say that "A" should be valued 1, "B" 2, ...etc (with the exception of a single letter encoding). So in "AAA", the right most "A" represents a coefficient of 0, while the other "A"s represent ones: 1*262 1*261 0
This leads to the following code:
def decode(code):
val = 0
for ch in code: # base-26 decoding "plus 1"
val = val * 26 ord(ch) - ord("A") 1
return val - 1
Of course, if we want the column numbers to start with 1 instead of 0, then just replace that final statement with:
return val
CodePudding user response:
sum of powers
You can sum the multiples of the powers of 26:
def xl2int(s):
s = s.strip().upper()
return sum((ord(c)-ord('A') 1)*26**i
for i,c in enumerate(reversed(s)))
xl2int('A')
# 1
xl2int('Z')
# 26
xl2int('AA')
# 27
xl2int('ZZ')
# 702
xl2int('AAA')
# 703
int
builtin
You can use a string translation table and the int
builtin with the base
parameter.
As you have a broken base you need to add 26**n 26**(n-1) ... 26**0 for an input of length n, which you can obtain with int('11...1', base=26)
where there are as many 1s as the length of the input string.
from string import ascii_uppercase, digits
t = str.maketrans(dict(zip(ascii_uppercase, digits ascii_uppercase)))
def xl2int(s):
s = s.strip().upper().translate(t)
return int(s, base=26) int('1'*len(s), base=26)
xl2int('A')
# 1
xl2int('Z')
# 26
xl2int('AA')
# 27
xl2int('ZZ')
# 702
xl2int('AAA')
# 703
How the translation works
It shifts each character so that A -> 0, B -> 1... J -> 9, K -> A... Z -> P. Then it converts it to integer using int
. However the obtained number is incorrect as we are missing 26**x for each digit position in the number, so we add as many power of 26 as there are digits in the input.