I am trying to split a DataFrame and plot the created subsets using a for loop in a Jupyter notebook:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = pd.DataFrame(np.random.rand(120, 10))
for i in range(len(data) // 10):
    split_data = data.iloc[i:i*(len(data) // 10)]
    for i, row in split_data.iterrows():
        row.plot(figsize=(15,5))
    plt.show()
Here the figures being plotted are not reset: the curves of the first figure appear again on the second one, and so on.
How can I reset the plot so that I have only 10 curves per plot?
CodePudding user response:
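The slice calculation in your loop is incorrect. It generates the following start:stop indices, so each subset overlaps the previous ones, which is why the earlier curves reappear: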
0:0, 1:12, 2:24, 3:36, 4:48, 5:60, 6:72, 7:84, 8:96, 9:108, 10:120, 11:132
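To check this without plotting anything, here is a minimal sketch that reproduces those bounds in plain Python and compares them with non-overlapping ones. The names `n_rows`, `chunk_size`, `buggy_bounds` and `fixed_bounds` are made up for illustration; `chunk_size = 10` assumes you want 10 rows per figure.

```python
n_rows = 120      # len(data) in the question
chunk_size = 10   # rows wanted per figure (assumption)

# Bounds produced by data.iloc[i:i*(len(data) // 10)]
buggy_bounds = [(i, i * (n_rows // 10)) for i in range(n_rows // 10)]

# Non-overlapping bounds: each chunk starts where the previous one ended
fixed_bounds = [(start, start + chunk_size)
                for start in range(0, n_rows, chunk_size)]

print(buggy_bounds[:4])  # [(0, 0), (1, 12), (2, 24), (3, 36)]
print(fixed_bounds[:4])  # [(0, 10), (10, 20), (20, 30), (30, 40)]
```

With the second set of bounds, `data.iloc[start:start + chunk_size]` would give 12 disjoint subsets of 10 rows each.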
But in any case, avoid the manual indexing. Use a groupby to split your DataFrame:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = pd.DataFrame(np.random.rand(120, 10))
for k, split_data in data.groupby(np.arange(len(data))//10):
    split_data.plot()
    plt.show()
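Note that `split_data.plot()` draws one curve per column, whereas the loop in the question drew one curve per row with `row.plot()`. If rows are what you want as curves, one possible variant is to transpose each subset before plotting. This is only a sketch under that assumption; `chunk_size` and `figures` are hypothetical names, and the slicing uses non-overlapping `iloc` bounds instead of groupby.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame(np.random.rand(120, 10))
chunk_size = 10  # rows per figure (assumed from the question)

figures = []
for start in range(0, len(data), chunk_size):
    # Disjoint subset of 10 consecutive rows
    split_data = data.iloc[start:start + chunk_size]
    # Transpose so each original row becomes one curve;
    # DataFrame.plot() creates a new figure on every call
    ax = split_data.T.plot(figsize=(15, 5), legend=False)
    figures.append(ax.figure)

plt.show()
```

Because each call to `DataFrame.plot()` opens its own figure, nothing carries over from one plot to the next.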