I had a list of input something like:
A1A
B1A
C2D
B1A
D3C
B2D
...
and I wish to output a pivot table like:
1A | 3C | 2D | |
---|---|---|---|
A | 1 | ||
B | 2 | 1 | |
C | 1 | ||
D | 1 |
The list first character as row name and count the individual remaining character.
I was trying to transform the list into dic format first
input = ['A1A', 'B1A', 'C2D', 'B1A', 'D3C', 'B2D']
dict_r = {}
for id in input:
result.setdefault(id[0], []).append(id)
and I have no idea what I can do for next.
CodePudding user response:
You can use str.extract
and pandas.crosstab
:
inp = ['A1A', 'B1A', 'C2D', 'B1A', 'D3C', 'B2D']
a,b = pd.Series(inp).str.extract('(.)(. )').T.values
out = pd.crosstab(a, b).rename_axis(index=None, columns=None)
or with a simple list comprehension:
a,b = zip(*((x[0], x[1:]) for x in inp))
out = pd.crosstab(a, b).rename_axis(index=None, columns=None)
output:
1A 2D 3C
A 1 0 0
B 2 1 0
C 0 1 0
D 0 0 1
CodePudding user response:
You can try split the list item to two parts then pivot table
lst = ['A1A', 'B1A', 'C2D', 'B1A', 'D3C', 'B2D']
out = pd.DataFrame(data=[[i[0], i[1:], 0] for i in lst], columns=['index', 'column', 'value'])
res = out.pivot_table(index='index', columns='column', values='value', fill_value=0, aggfunc='count')
print(out)
index column value
0 A 1A 0
1 B 1A 0
2 C 2D 0
3 B 1A 0
4 D 3C 0
5 B 2D 0
print(res)
column 1A 2D 3C
index
A 1 0 0
B 2 1 0
C 0 1 0
D 0 0 1