Home > Blockchain >  Is there a way to not hard code elements of a list to go in different columns of a Pandas data frame
Is there a way to not hard code elements of a list to go in different columns of a Pandas data frame

Time:04-12

I have a column in a Pandas dataframe that looks like this

Date       ColDat
===================================
2022-02-10 [2,6,3,8,2,8,8,3]
2022-02-11 [1,4,9,3,8,3,1,9]
2022-02-12 [1,6,2,7,8,3,4,0]

where ColDat is a list of integers.

I don't want to have to hard code the mapping of each entry of the list into a different column. I did this for the time being, but it's probably not best-practice. Is there a way to iterate and automatically assign iterative column names to have a result like the following?

Date       C1 C2 C3 C4 ... C8
===================================
2022-02-10 2  6  3  8  ... 3
2022-02-11 1  4  9  3  ... 9
2022-02-12 1  6  2  7  ... 0

CodePudding user response:

d = {'date': ['2022-02-10', '2022-02-11', '2022-02-12'],
    'coldat': [[2,6,3,8,2,8,8,3],[1,4,9,3,8,3,1,9],[1,6,2,7,8,3,4,0]]
}
df = pd.DataFrame(data=d)

new_df = df.coldat.apply(pd.Series)
new_df.columns = ["C" str(x 1) for x in range(len(new_df.columns))]
new_df.index = df.date

CodePudding user response:

Here is another way:

(df[['date']].join(pd.DataFrame(df['coldat'].tolist())
    .rename(lambda x: 'C{}'.format(x 1),axis=1)))

Output:

         date  C1  C2  C3  C4  C5  C6  C7  C8
0  2022-02-10   2   6   3   8   2   8   8   3
1  2022-02-11   1   4   9   3   8   3   1   9
2  2022-02-12   1   6   2   7   8   3   4   0

CodePudding user response:

Let's have for example this dataframe:

df = pd.DataFrame(np.random.randint(100, size = 25))
df.columns = ["col1"]
df["ColDat"] = df["col1"].apply(lambda x:[x*i for i in range(33)])

df

out:

   col1 ColDat
0   40  [0, 40, 80, 120, 160, 200, 240, 280, 320, 360,...
1   5   [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55,...
2   42  [0, 42, 84, 126, 168, 210, 252, 294, 336, 378,...
3   67  [0, 67, 134, 201, 268, 335, 402, 469, 536, 603...
4   94  [0, 94, 188, 282, 376, 470, 564, 658, 752, 846...
5   35  [0, 35, 70, 105, 140, 175, 210, 245, 280, 315,...
6   43  [0, 43, 86, 129, 172, 215, 258, 301, 344, 387,...
7   57  [0, 57, 114, 171, 228, 285, 342, 399, 456, 513...
8   82  [0, 82, 164, 246, 328, 410, 492, 574, 656, 738...
9   89  [0, 89, 178, 267, 356, 445, 534, 623, 712, 801...
10  74  [0, 74, 148, 222, 296, 370, 444, 518, 592, 666...
11  21  [0, 21, 42, 63, 84, 105, 126, 147, 168, 189, 2...
12  66  [0, 66, 132, 198, 264, 330, 396, 462, 528, 594...
13  53  [0, 53, 106, 159, 212, 265, 318, 371, 424, 477...
14  15  [0, 15, 30, 45, 60, 75, 90, 105, 120, 135, 150...
15  28  [0, 28, 56, 84, 112, 140, 168, 196, 224, 252, ...
16  27  [0, 27, 54, 81, 108, 135, 162, 189, 216, 243, ...
17  76  [0, 76, 152, 228, 304, 380, 456, 532, 608, 684...
18  35  [0, 35, 70, 105, 140, 175, 210, 245, 280, 315,...
19  7   [0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77,...
20  47  [0, 47, 94, 141, 188, 235, 282, 329, 376, 423,...
21  70  [0, 70, 140, 210, 280, 350, 420, 490, 560, 630...
22  65  [0, 65, 130, 195, 260, 325, 390, 455, 520, 585...
23  48  [0, 48, 96, 144, 192, 240, 288, 336, 384, 432,...
24  44  [0, 44, 88, 132, 176, 220, 264, 308, 352, 396,...

ColDat (in my case) is a list of 33 int:

to expand lists into new columns you can:

pd.concat([df, df.ColDat.apply(pd.Series)], axis=1)

out:

    col1                                             ColDat 0   1   2   3   4   5   6   7   ... 23  24  25  26  27  28  29  30  31  32
0   40  [0, 40, 80, 120, 160, 200, 240, 280, 320, 360,...   0   40  80  120 160 200 240 280 ... 920 960 1000    1040    1080    1120    1160    1200    1240    1280
1   5   [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55,...   0   5   10  15  20  25  30  35  ... 115 120 125 130 135 140 145 150 155 160
2   42  [0, 42, 84, 126, 168, 210, 252, 294, 336, 378,...   0   42  84  126 168 210 252 294 ... 966 1008    1050    1092    1134    1176    1218    1260    1302    1344
3   67  [0, 67, 134, 201, 268, 335, 402, 469, 536, 603...   0   67  134 201 268 335 402 469 ... 1541    1608    1675    1742    1809    1876    1943    2010    2077    2144
4   94  [0, 94, 188, 282, 376, 470, 564, 658, 752, 846...   0   94  188 282 376 470 564 658 ... 2162    2256    2350    2444    2538    2632    2726    2820    2914    3008
5   35  [0, 35, 70, 105, 140, 175, 210, 245, 280, 315,...   0   35  70  105 140 175 210 245 ... 805 840 875 910 945 980 1015    1050    1085    1120
6   43  [0, 43, 86, 129, 172, 215, 258, 301, 344, 387,...   0   43  86  129 172 215 258 301 ... 989 1032    1075    1118    1161    1204    1247    1290    1333    1376
7   57  [0, 57, 114, 171, 228, 285, 342, 399, 456, 513...   0   57  114 171 228 285 342 399 ... 1311    1368    1425    1482    1539    1596    1653    1710    1767    1824
8   82  [0, 82, 164, 246, 328, 410, 492, 574, 656, 738...   0   82  164 246 328 410 492 574 ... 1886    1968    2050    2132    2214    2296    2378    2460    2542    2624
9   89  [0, 89, 178, 267, 356, 445, 534, 623, 712, 801...   0   89  178 267 356 445 534 623 ... 2047    2136    2225    2314    2403    2492    2581    2670    2759    2848
10  74  [0, 74, 148, 222, 296, 370, 444, 518, 592, 666...   0   74  148 222 296 370 444 518 ... 1702    1776    1850    1924    1998    2072    2146    2220    2294    2368
11  21  [0, 21, 42, 63, 84, 105, 126, 147, 168, 189, 2...   0   21  42  63  84  105 126 147 ... 483 504 525 546 567 588 609 630 651 672
12  66  [0, 66, 132, 198, 264, 330, 396, 462, 528, 594...   0   66  132 198 264 330 396 462 ... 1518    1584    1650    1716    1782    1848    1914    1980    2046    2112
13  53  [0, 53, 106, 159, 212, 265, 318, 371, 424, 477...   0   53  106 159 212 265 318 371 ... 1219    1272    1325    1378    1431    1484    1537    1590    1643    1696
14  15  [0, 15, 30, 45, 60, 75, 90, 105, 120, 135, 150...   0   15  30  45  60  75  90  105 ... 345 360 375 390 405 420 435 450 465 480
15  28  [0, 28, 56, 84, 112, 140, 168, 196, 224, 252, ...   0   28  56  84  112 140 168 196 ... 644 672 700 728 756 784 812 840 868 896
16  27  [0, 27, 54, 81, 108, 135, 162, 189, 216, 243, ...   0   27  54  81  108 135 162 189 ... 621 648 675 702 729 756 783 810 837 864
17  76  [0, 76, 152, 228, 304, 380, 456, 532, 608, 684...   0   76  152 228 304 380 456 532 ... 1748    1824    1900    1976    2052    2128    2204    2280    2356    2432
18  35  [0, 35, 70, 105, 140, 175, 210, 245, 280, 315,...   0   35  70  105 140 175 210 245 ... 805 840 875 910 945 980 1015    1050    1085    1120
19  7   [0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77,...   0   7   14  21  28  35  42  49  ... 161 168 175 182 189 196 203 210 217 224
20  47  [0, 47, 94, 141, 188, 235, 282, 329, 376, 423,...   0   47  94  141 188 235 282 329 ... 1081    1128    1175    1222    1269    1316    1363    1410    1457    1504
21  70  [0, 70, 140, 210, 280, 350, 420, 490, 560, 630...   0   70  140 210 280 350 420 490 ... 1610    1680    1750    1820    1890    1960    2030    2100    2170    2240
22  65  [0, 65, 130, 195, 260, 325, 390, 455, 520, 585...   0   65  130 195 260 325 390 455 ... 1495    1560    1625    1690    1755    1820    1885    1950    2015    2080
23  48  [0, 48, 96, 144, 192, 240, 288, 336, 384, 432,...   0   48  96  144 192 240 288 336 ... 1104    1152    1200    1248    1296    1344    1392    1440    1488    1536
24  44  [0, 44, 88, 132, 176, 220, 264, 308, 352, 396,...   0   44  88  132 176 220 264 308 ... 1012    1056    1100    1144    1188    1232    1276    1320    1364    1408

if you want to remove the original ColDat column you have to add a .drop("ColDat", axis=1) at the end of the pd.concat call.

Note that you are not cycling, so this should be faster than other possible solution that include a for loop in a lambda function (for example).

CodePudding user response:

I tried manipulation of array, here is my out put

enter image description here

so you can copy the code

d=[[[2,6,3,8,2,8,8,3]], [[1,4,9,3,8,3,1,9]], [[1,6,2,7,8,3,4,0]]]
data=pd.DataFrame(d, columns=['col'])
new_data=np.array(data.values.tolist())
pd.DataFrame(new_data.reshape(3, 8).tolist())
  • Related