I have a column in a Pandas dataframe that looks like this
Date ColDat
===================================
2022-02-10 [2,6,3,8,2,8,8,3]
2022-02-11 [1,4,9,3,8,3,1,9]
2022-02-12 [1,6,2,7,8,3,4,0]
where ColDat
is a list of integers.
I don't want to have to hard code the mapping of each entry of the list into a different column. I did this for the time being, but it's probably not best-practice. Is there a way to iterate and automatically assign iterative column names to have a result like the following?
Date C1 C2 C3 C4 ... C8
===================================
2022-02-10 2 6 3 8 ... 3
2022-02-11 1 4 9 3 ... 9
2022-02-12 1 6 2 7 ... 0
CodePudding user response:
d = {'date': ['2022-02-10', '2022-02-11', '2022-02-12'],
'coldat': [[2,6,3,8,2,8,8,3],[1,4,9,3,8,3,1,9],[1,6,2,7,8,3,4,0]]
}
df = pd.DataFrame(data=d)
new_df = df.coldat.apply(pd.Series)
new_df.columns = ["C" str(x 1) for x in range(len(new_df.columns))]
new_df.index = df.date
CodePudding user response:
Here is another way:
(df[['date']].join(pd.DataFrame(df['coldat'].tolist())
.rename(lambda x: 'C{}'.format(x 1),axis=1)))
Output:
date C1 C2 C3 C4 C5 C6 C7 C8
0 2022-02-10 2 6 3 8 2 8 8 3
1 2022-02-11 1 4 9 3 8 3 1 9
2 2022-02-12 1 6 2 7 8 3 4 0
CodePudding user response:
Let's have for example this dataframe:
df = pd.DataFrame(np.random.randint(100, size = 25))
df.columns = ["col1"]
df["ColDat"] = df["col1"].apply(lambda x:[x*i for i in range(33)])
df
out:
col1 ColDat
0 40 [0, 40, 80, 120, 160, 200, 240, 280, 320, 360,...
1 5 [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55,...
2 42 [0, 42, 84, 126, 168, 210, 252, 294, 336, 378,...
3 67 [0, 67, 134, 201, 268, 335, 402, 469, 536, 603...
4 94 [0, 94, 188, 282, 376, 470, 564, 658, 752, 846...
5 35 [0, 35, 70, 105, 140, 175, 210, 245, 280, 315,...
6 43 [0, 43, 86, 129, 172, 215, 258, 301, 344, 387,...
7 57 [0, 57, 114, 171, 228, 285, 342, 399, 456, 513...
8 82 [0, 82, 164, 246, 328, 410, 492, 574, 656, 738...
9 89 [0, 89, 178, 267, 356, 445, 534, 623, 712, 801...
10 74 [0, 74, 148, 222, 296, 370, 444, 518, 592, 666...
11 21 [0, 21, 42, 63, 84, 105, 126, 147, 168, 189, 2...
12 66 [0, 66, 132, 198, 264, 330, 396, 462, 528, 594...
13 53 [0, 53, 106, 159, 212, 265, 318, 371, 424, 477...
14 15 [0, 15, 30, 45, 60, 75, 90, 105, 120, 135, 150...
15 28 [0, 28, 56, 84, 112, 140, 168, 196, 224, 252, ...
16 27 [0, 27, 54, 81, 108, 135, 162, 189, 216, 243, ...
17 76 [0, 76, 152, 228, 304, 380, 456, 532, 608, 684...
18 35 [0, 35, 70, 105, 140, 175, 210, 245, 280, 315,...
19 7 [0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77,...
20 47 [0, 47, 94, 141, 188, 235, 282, 329, 376, 423,...
21 70 [0, 70, 140, 210, 280, 350, 420, 490, 560, 630...
22 65 [0, 65, 130, 195, 260, 325, 390, 455, 520, 585...
23 48 [0, 48, 96, 144, 192, 240, 288, 336, 384, 432,...
24 44 [0, 44, 88, 132, 176, 220, 264, 308, 352, 396,...
ColDat
(in my case) is a list of 33 int
:
to expand lists into new columns you can:
pd.concat([df, df.ColDat.apply(pd.Series)], axis=1)
out:
col1 ColDat 0 1 2 3 4 5 6 7 ... 23 24 25 26 27 28 29 30 31 32
0 40 [0, 40, 80, 120, 160, 200, 240, 280, 320, 360,... 0 40 80 120 160 200 240 280 ... 920 960 1000 1040 1080 1120 1160 1200 1240 1280
1 5 [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55,... 0 5 10 15 20 25 30 35 ... 115 120 125 130 135 140 145 150 155 160
2 42 [0, 42, 84, 126, 168, 210, 252, 294, 336, 378,... 0 42 84 126 168 210 252 294 ... 966 1008 1050 1092 1134 1176 1218 1260 1302 1344
3 67 [0, 67, 134, 201, 268, 335, 402, 469, 536, 603... 0 67 134 201 268 335 402 469 ... 1541 1608 1675 1742 1809 1876 1943 2010 2077 2144
4 94 [0, 94, 188, 282, 376, 470, 564, 658, 752, 846... 0 94 188 282 376 470 564 658 ... 2162 2256 2350 2444 2538 2632 2726 2820 2914 3008
5 35 [0, 35, 70, 105, 140, 175, 210, 245, 280, 315,... 0 35 70 105 140 175 210 245 ... 805 840 875 910 945 980 1015 1050 1085 1120
6 43 [0, 43, 86, 129, 172, 215, 258, 301, 344, 387,... 0 43 86 129 172 215 258 301 ... 989 1032 1075 1118 1161 1204 1247 1290 1333 1376
7 57 [0, 57, 114, 171, 228, 285, 342, 399, 456, 513... 0 57 114 171 228 285 342 399 ... 1311 1368 1425 1482 1539 1596 1653 1710 1767 1824
8 82 [0, 82, 164, 246, 328, 410, 492, 574, 656, 738... 0 82 164 246 328 410 492 574 ... 1886 1968 2050 2132 2214 2296 2378 2460 2542 2624
9 89 [0, 89, 178, 267, 356, 445, 534, 623, 712, 801... 0 89 178 267 356 445 534 623 ... 2047 2136 2225 2314 2403 2492 2581 2670 2759 2848
10 74 [0, 74, 148, 222, 296, 370, 444, 518, 592, 666... 0 74 148 222 296 370 444 518 ... 1702 1776 1850 1924 1998 2072 2146 2220 2294 2368
11 21 [0, 21, 42, 63, 84, 105, 126, 147, 168, 189, 2... 0 21 42 63 84 105 126 147 ... 483 504 525 546 567 588 609 630 651 672
12 66 [0, 66, 132, 198, 264, 330, 396, 462, 528, 594... 0 66 132 198 264 330 396 462 ... 1518 1584 1650 1716 1782 1848 1914 1980 2046 2112
13 53 [0, 53, 106, 159, 212, 265, 318, 371, 424, 477... 0 53 106 159 212 265 318 371 ... 1219 1272 1325 1378 1431 1484 1537 1590 1643 1696
14 15 [0, 15, 30, 45, 60, 75, 90, 105, 120, 135, 150... 0 15 30 45 60 75 90 105 ... 345 360 375 390 405 420 435 450 465 480
15 28 [0, 28, 56, 84, 112, 140, 168, 196, 224, 252, ... 0 28 56 84 112 140 168 196 ... 644 672 700 728 756 784 812 840 868 896
16 27 [0, 27, 54, 81, 108, 135, 162, 189, 216, 243, ... 0 27 54 81 108 135 162 189 ... 621 648 675 702 729 756 783 810 837 864
17 76 [0, 76, 152, 228, 304, 380, 456, 532, 608, 684... 0 76 152 228 304 380 456 532 ... 1748 1824 1900 1976 2052 2128 2204 2280 2356 2432
18 35 [0, 35, 70, 105, 140, 175, 210, 245, 280, 315,... 0 35 70 105 140 175 210 245 ... 805 840 875 910 945 980 1015 1050 1085 1120
19 7 [0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77,... 0 7 14 21 28 35 42 49 ... 161 168 175 182 189 196 203 210 217 224
20 47 [0, 47, 94, 141, 188, 235, 282, 329, 376, 423,... 0 47 94 141 188 235 282 329 ... 1081 1128 1175 1222 1269 1316 1363 1410 1457 1504
21 70 [0, 70, 140, 210, 280, 350, 420, 490, 560, 630... 0 70 140 210 280 350 420 490 ... 1610 1680 1750 1820 1890 1960 2030 2100 2170 2240
22 65 [0, 65, 130, 195, 260, 325, 390, 455, 520, 585... 0 65 130 195 260 325 390 455 ... 1495 1560 1625 1690 1755 1820 1885 1950 2015 2080
23 48 [0, 48, 96, 144, 192, 240, 288, 336, 384, 432,... 0 48 96 144 192 240 288 336 ... 1104 1152 1200 1248 1296 1344 1392 1440 1488 1536
24 44 [0, 44, 88, 132, 176, 220, 264, 308, 352, 396,... 0 44 88 132 176 220 264 308 ... 1012 1056 1100 1144 1188 1232 1276 1320 1364 1408
if you want to remove the original ColDat
column you have to add a .drop("ColDat", axis=1)
at the end of the pd.concat
call.
Note that you are not cycling, so this should be faster than other possible solution that include a for
loop in a lambda
function (for example).
CodePudding user response:
I tried manipulation of array, here is my out put
so you can copy the code
d=[[[2,6,3,8,2,8,8,3]], [[1,4,9,3,8,3,1,9]], [[1,6,2,7,8,3,4,0]]]
data=pd.DataFrame(d, columns=['col'])
new_data=np.array(data.values.tolist())
pd.DataFrame(new_data.reshape(3, 8).tolist())