Dataframe add multiple columns from list with each column name created

Time:12-10

There is a list of dicts d, in which x is an embedded list, e.g.,

d = [{"name":"Python", "x":[0,1,2,3,4,5]},  # x has 300 elements
     {"name":"C  ", "x":[0,1,0,3,4,4]},
     {"name":"Java","x":[0,4,5,6,1]}]

I want to convert d to a DataFrame and automatically add one column per element of x, with each added column name carrying the prefix "abc", e.g.,

df.columns = ["name", "abc0", "abc1", ..., "abc299"]

I'm looking for an efficient way to do this, as d contains many dicts. When I added the columns manually, pandas warned:

PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead.  To get a de-fragmented frame, use `newframe = frame.copy()`

CodePudding user response:

I hope this is what you need. If it helps, please upvote and accept the answer.

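import pandas as pd

d = {
  "name": "abc",
  "x": list(range(300))  # 300 elements
}

df = pd.DataFrame(d)  # 300 rows: "name" is broadcast, "x" holds the values
df = df.T             # 2 rows ("name", "x") by 300 columns
# Build the labels from the "name" row: "abc0", "abc1", ...
df.columns = [i + str(idx) for idx, i in enumerate(df.iloc[0])]
df.drop(index=df.index[0], axis=0, inplace=True)  # drop the "name" row
df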
Out[91]: 
  abc0 abc1 abc2 abc3 abc4 abc5 abc6 abc7 abc8 abc9  ... abc290 abc291 abc292  \
x    0    1    2    3    4    5    6    7    8    9  ...    290    291    292   

  abc293 abc294 abc295 abc296 abc297 abc298 abc299  
x    293    294    295    296    297    298    299  

[1 rows x 300 columns]

CodePudding user response:

Are you looking for something like this?

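import pandas as pd

d = [{"name":"Python", "x":[0,1,2,3,4,5]},  # x has 300 elements
     {"name":"C  ", "x":[0,1,0,3,4,4]},
     {"name":"Java","x":[0,4,5,6,1]}]

# One dict per record: "name" plus one "abc<i>" key per element of x.
# Shorter x lists leave the trailing columns as NaN.
df = pd.DataFrame(
    {
        "name": record["name"],
        **{f"abc{i}": n for i, n in enumerate(record["x"])}
    }
    for record in d
)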

Result for your sample:

     name  abc0  abc1  abc2  abc3  abc4  abc5
0  Python     0     1     2     3     4   5.0
1     C       0     1     0     3     4   4.0
2    Java     0     4     5     6     1   NaN
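
Since the warning in the question recommends joining all columns at once with pd.concat(axis=1), another possible sketch is to build the x values as one frame and attach the names in a single concat call. This relies on pandas padding shorter rows with NaN when given a list of lists of unequal length. The names `values` and `names` are illustrative only, and no timing has been measured here.

```python
import pandas as pd

d = [{"name": "Python", "x": [0, 1, 2, 3, 4, 5]},
     {"name": "C  ", "x": [0, 1, 0, 3, 4, 4]},
     {"name": "Java", "x": [0, 4, 5, 6, 1]}]

# One row per record; the integer column labels 0, 1, ... become "abc0", "abc1", ...
values = pd.DataFrame([record["x"] for record in d]).add_prefix("abc")

# Names as a single column on the same default integer index.
names = pd.Series([record["name"] for record in d], name="name")

# Join everything in one call rather than inserting columns one by one.
df = pd.concat([names, values], axis=1)
```

As in the result above, the last column of the "Java" row is NaN because its x is one element shorter.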

CodePudding user response:

You can turn the contents of the list of dictionaries into a list of strings with the following list comprehension:

column_names = [p['name'] + str(p['x'][idx]) for p in d for idx in range(len(p['x']))]

For your example, you obtain

['Python0', 'Python1', 'Python2', 'Python3', 'Python4', 'Python5', 'C  0', 'C  1', 'C  0', 'C  3', 'C  4', 'C  4', 'Java0', 'Java4', 'Java5', 'Java6', 'Java1']

and then you can construct an empty DataFrame with

import pandas

df = pandas.DataFrame(columns=column_names)
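
If the target is the naming scheme from the question ("name", then "abc0", "abc1", ...), a minimal sketch could derive the names from the longest x instead. Here `width` is a hypothetical variable name, and letting the longest list set the column count is an assumption about what the asker wants.

```python
import pandas as pd

d = [{"name": "Python", "x": [0, 1, 2, 3, 4, 5]},
     {"name": "C  ", "x": [0, 1, 0, 3, 4, 4]},
     {"name": "Java", "x": [0, 4, 5, 6, 1]}]

# Assumption: the longest x decides how many "abc" columns are needed.
width = max(len(p["x"]) for p in d)

# "name" followed by "abc0" .. "abc{width-1}".
column_names = ["name"] + [f"abc{i}" for i in range(width)]

# Empty frame that carries only the column labels.
df = pd.DataFrame(columns=column_names)
```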