Pivot a list with first letter as row and remained as column-CodePudding

I had a list of input something like:

A1A
B1A
C2D
B1A
D3C
B2D
...

and I wish to output a pivot table like:

	1A	3C	2D
A	1
B	2		1
C			1
D		1

The list first character as row name and count the individual remaining character.

I was trying to transform the list into dic format first

input = ['A1A', 'B1A', 'C2D', 'B1A', 'D3C', 'B2D']

dict_r = {}
for id in input:
  result.setdefault(id[0], []).append(id)

and I have no idea what I can do for next.

CodePudding user response：

You can use str.extract and pandas.crosstab:

inp = ['A1A', 'B1A', 'C2D', 'B1A', 'D3C', 'B2D']

a,b = pd.Series(inp).str.extract('(.)(. )').T.values

out = pd.crosstab(a, b).rename_axis(index=None, columns=None)

or with a simple list comprehension:

a,b = zip(*((x[0], x[1:]) for x in inp))

out = pd.crosstab(a, b).rename_axis(index=None, columns=None)

output:

   1A  2D  3C
A   1   0   0
B   2   1   0
C   0   1   0
D   0   0   1

CodePudding user response：

You can try split the list item to two parts then pivot table

lst = ['A1A', 'B1A', 'C2D', 'B1A', 'D3C', 'B2D']

out = pd.DataFrame(data=[[i[0], i[1:], 0] for i in lst], columns=['index', 'column', 'value'])

res = out.pivot_table(index='index', columns='column', values='value', fill_value=0, aggfunc='count')

print(out)

  index column  value
0     A     1A      0
1     B     1A      0
2     C     2D      0
3     B     1A      0
4     D     3C      0
5     B     2D      0

print(res)

column  1A  2D  3C
index
A        1   0   0
B        2   1   0
C        0   1   0
D        0   0   1