I'm working on consolidating some mathematical information relating to my research, and I want to create a pandas dataframe to store some numerical values. All the values come from functions that correspond to the row index n - this is crucial. Right now, I am using a list-of-lists scheme to store and access everything. Here is an MWE of my current setup:
from math import floor

a=[[0]]
b=[[0]]
for n in range(1,10):
    a.append(int(floor((2*n)/3)))
    temp=[]
    for k in range(1,n):
        temp.append(int(floor((n+k-1)/(k+1))))
    b.append(temp)
When I want to consolidate everything, I simply cast the list as sets and read off the index to get the values I desire:
fix=[[0]]
for i in range(1,10):
    # a[i] is a plain int, so wrap it in a set literal before taking the union
    fix.append(list({a[i]} | set(b[i])))
My desire to convert this structure to a dataframe is so that I have more tools to do analysis/plotting with (I can elaborate on some other considerations too if it is needed). This is how I have started to create my dataframe:
import pandas as pd

df=pd.DataFrame([a,b])
df=df.transpose()
df.columns=['Graph X','Kth Operation of Graph X']
This gives me output that looks like:
df
   Graph X  Kth Operation of Graph X
0      [0]                       [0]
1        0                        []
2        1                       [1]
3        2                    [1, 1]
4        2                 [2, 1, 1]
5        3              [2, 2, 1, 1]
6        4           [3, 2, 2, 1, 1]
7        4        [3, 2, 2, 2, 1, 1]
8        5     [4, 3, 2, 2, 2, 1, 1]
9        6  [4, 3, 2, 2, 2, 2, 1, 1]
Now, my ultimate goal is to create a separate column for every value of k and name it appropriately. For example, I would like to have "Operation 1 on Graph X" with values [0, ,1,1,2,2,3,3,4,4] and so forth (bear in mind that ideally there will be multiple functions that create multiple columns). However, digging through SE and assorted tutorials has not brought me as close to this goal as I had hoped.
I appreciate any feedback or suggestions and am open to reformulating how I should go about this.
Answer:
Say you have a dataframe df that looks like the one you constructed, but with two columns each for several functions, all named according to the same scheme, e.g.
   Graph X  Kth Operation on Graph X  Graph Z3  Kth Operation on Graph Z3
0      [0]                       [0]       [0]                        [0]
1        0                        []         1                         []
2        1                       [1]        10                        [2]
3        2                    [1, 1]        20                     [2, 2]
4        2                 [2, 1, 1]        20                  [4, 2, 2]
5        3              [2, 2, 1, 1]       300               [6, 6, 3, 3]
6        4           [3, 2, 2, 1, 1]       400            [6, 4, 4, 1, 1]
7        4        [3, 2, 2, 2, 1, 1]       400         [6, 2, 2, 2, 1, 1]
8        5     [4, 3, 2, 2, 2, 1, 1]       800      [8, 3, 2, 2, 2, 1, 1]
9        6  [4, 3, 2, 2, 2, 2, 1, 1]      1000   [8, 3, 2, 2, 2, 2, 1, 1]
Then you can extract and create the appropriate column names with list comprehensions and format strings. Iterating over the functions, you can add the list values as individual columns:
graph_column_names = [name for name in df.columns
                      if name.startswith('Graph')]
for graph in graph_column_names:
    # n is assumed to still hold its final value from the question's loop
    # (n = 9 after range(1, 10)), so k runs over 1..8, the longest list length
    operation_column_names = [f'Operation {k} on {graph}' for k in range(1, n)]
    df[operation_column_names] = pd.DataFrame(df[f'Kth Operation on {graph}'].to_list(),
                                              columns=operation_column_names)
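For reference, here is a minimal end-to-end sketch of the same idea for a single function, so it can be run without any leftover variables. It rebuilds the list column from the formula in the question and derives the number of operation columns from the longest list rather than from n. The names list_column, width, names and expanded are only illustrative, and the use of join instead of multi-column assignment is just one possible choice.

```python
from math import floor

import pandas as pd

# Rebuild the question's list column: row 0 is the placeholder [0],
# row n holds floor((n + k - 1) / (k + 1)) for k = 1..n-1.
b = [[0]] + [[floor((n + k - 1) / (k + 1)) for k in range(1, n)]
             for n in range(1, 10)]

list_column = 'Kth Operation on Graph X'
df = pd.DataFrame({list_column: b})

# Number of operation columns = length of the longest list in the column.
width = int(df[list_column].map(len).max())
names = [f'Operation {k} on Graph X' for k in range(1, width + 1)]

# Expanding the lists pads shorter rows with NaN on the right.
expanded = pd.DataFrame(df[list_column].to_list(), columns=names, index=df.index)
df = df.join(expanded)

print(df['Operation 1 on Graph X'].tolist())
```

Rows whose list is shorter than the longest one (including the empty list in row 1) end up with NaN in the trailing columns, which is why the new columns are stored as floats. For several functions you could wrap the last four statements in a loop over the graph names, as in the snippet above.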