I have a data set called "df" consisting of 5000 rows and 32 columns. I want to plot 16 graphs using a for loop. There are two problems that I cannot overcome. First, the plot does not show when using this code:
proef_numbers = [1,2,3,4,5]
def plot_results(df, proef_numbers, title):
    for proef in proef_numbers:
        for test in range(1,2,3,4,5):
            S_data = df[f"S_{proef}_{test}"][1:DATA_END_VALUES[proef-1][test-1]]
            F_data = df[f"F_{proef}_{test}"][1:DATA_END_VALUES[proef-1][test-1]]-F0
            plt.plot(S_data, F_data, label=f"Proef {proef} test {test}" )
            plt.xlabel('Time [s]')
            plt.ylabel('Force [N]')
            plt.title(f"Proef {proef}, test {test}")
            plt.legend()
            plt.show()
After this I tried something else: I restructured my data set and wanted to use the following for loop:
for i in range(1,17):
    plt.plot(df[i],df[i + 16])
    plt.show()
Then I get the error:
KeyError: 1
For some reason, I cannot even print(df[1]) anymore. It also gives me "KeyError: 1". As you have probably guessed by now, I am very new to Python.
CodePudding user response:
If you want to see multiple plots at the same time, like a grid of plots, I suggest looking at subplots: https://matplotlib.org/stable/gallery/subplots_axes_and_figures/subplots_demo.html
For indexing your dataframe you should use the .loc method (for labels) or the .iloc method (for integer positions). Have a look at:
https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html
Since you are new to Python, I would suggest learning to use NumPy arrays. You can convert your dataframe directly to a NumPy array and then plot slices of it.
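To make the two suggestions above concrete, here is a minimal sketch that converts a dataframe to a NumPy array and draws 16 column pairs on a 4 x 4 grid of subplots. The data and the column names S_0..S_15 and F_0..F_15 are made up for illustration; the only thing assumed about the real data is that the 16 x-columns come first and the 16 y-columns follow.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for the real data: 16 x-columns followed by 16 y-columns.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
data = {f"S_{k}": t for k in range(16)}
data.update({f"F_{k}": np.cumsum(rng.normal(size=t.size)) for k in range(16)})
df = pd.DataFrame(data)

values = df.to_numpy()  # plain 2-D array, shape (rows, 32)

# 4 x 4 grid: one Axes per (x, y) column pair.
fig, axes = plt.subplots(4, 4, figsize=(12, 10), sharex=True)
for k, ax in enumerate(axes.flat):
    ax.plot(values[:, k], values[:, k + 16])  # column k against column k + 16
    ax.set_title(f"Pair {k}")
fig.tight_layout()
# plt.show()  # uncomment to display the figure window
```

Because the array is indexed by position, k runs from 0 to 15 and k + 16 from 16 to 31, which stays inside the 32 columns.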
CodePudding user response:
There are a couple of things in the code that could be causing the trouble.
First, the range function does not work the way it is used in the top code block. Its signature is range(start, stop, step): the start value is included, the stop value is excluded, and the step defaults to 1. It accepts at most three arguments, so range(1,2,3,4,5) raises a TypeError as soon as the function is called, and the code cannot run as written. If it is easier to read, you can replace range(1,5) (written as range(1,2,3,4,5) in the code above) with the list [1,2,3,4], since a for statement can iterate over a list just as it can over a range object.
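A quick way to check how range behaves is to turn it into a list. The snippet is only an illustration and does not depend on the asker's data:

```python
# range(start, stop, step): start is included, stop is excluded, step defaults to 1.
print(list(range(1, 5)))      # [1, 2, 3, 4]
print(list(range(1, 10, 3)))  # [1, 4, 7]
print(list(range(4)))         # [0, 1, 2, 3]

# Passing more than three arguments is rejected.
try:
    range(1, 2, 3, 4, 5)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```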
Also, how are you calling the function? The code example you gave defines plot_results but never calls it, and a function body only runs when the function is called. If you don't want to use a function, that is okay, and the code then becomes the code below. The function just makes the code more flexible if you want to make different variations of the plots.
import matplotlib.pyplot as plt

# df, DATA_END_VALUES and F0 are assumed to be defined as in the question.
proef_numbers = [1,2,3,4]
for proef in proef_numbers:
    for test in range(1,5):
        S_data = df[f"S_{proef}_{test}"][1:DATA_END_VALUES[proef-1][test-1]]
        F_data = df[f"F_{proef}_{test}"][1:DATA_END_VALUES[proef-1][test-1]]-F0
        plt.plot(S_data, F_data, label=f"Proef {proef} test {test}" )
        plt.xlabel('Time [s]')
        plt.ylabel('Force [N]')
        plt.title(f"Proef {proef}, test {test}")
        plt.legend()
        plt.show()  # one figure per (proef, test) pair
I tested it with dummy data from your other question, and it seems to work.
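If you prefer to keep the function, a self-contained sketch could look like the one below. Everything about the data is an assumption made for illustration: the values in df, the offset F0, and the cut-off rows in DATA_END_VALUES are invented, and the function takes a list of test numbers instead of the unused title argument. It slices with .iloc to make the positional slicing explicit, and it returns the figures rather than calling plt.show() itself.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-ins for the asker's objects (all values here are assumptions).
PROEF_NUMBERS = [1, 2, 3, 4]
TEST_NUMBERS = [1, 2, 3, 4]
F0 = 0.5                                             # hypothetical force offset
DATA_END_VALUES = [[40] * 4 for _ in PROEF_NUMBERS]  # hypothetical cut-off rows

t = np.linspace(0.0, 1.0, 50)
columns = {}
for proef in PROEF_NUMBERS:
    for test in TEST_NUMBERS:
        columns[f"S_{proef}_{test}"] = t
        columns[f"F_{proef}_{test}"] = F0 + proef * t + 0.1 * test
df = pd.DataFrame(columns)


def plot_results(df, proef_numbers, test_numbers):
    """Draw one figure per (proef, test) pair and return the figures."""
    figures = []
    for proef in proef_numbers:
        for test in test_numbers:
            end = DATA_END_VALUES[proef - 1][test - 1]
            s_data = df[f"S_{proef}_{test}"].iloc[1:end]
            f_data = df[f"F_{proef}_{test}"].iloc[1:end] - F0
            fig, ax = plt.subplots()
            ax.plot(s_data, f_data, label=f"Proef {proef} test {test}")
            ax.set_xlabel("Time [s]")
            ax.set_ylabel("Force [N]")
            ax.set_title(f"Proef {proef}, test {test}")
            ax.legend()
            figures.append(fig)
    return figures


# Defining the function draws nothing; this call is what runs the loop.
figures = plot_results(df, PROEF_NUMBERS, TEST_NUMBERS)
# plt.show()  # uncomment to display all 16 figures
```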
For your other question, it seems that you want to index columns by number, right? df[1] raises KeyError: 1 because it looks for a column labelled 1, and your columns presumably have string names. As this question shows, you can use .iloc on your pandas dataframe to select by integer position (instead of by column name). So you would change the second block of code to this:
for i in range(1,17):
    plt.plot(df.iloc[:,i],df.iloc[:,i + 16])
    plt.show()
Here, df.iloc[:,i] means that you are looking at all the rows (used by itself, : means all of the elements) and i means the ith column. Keep in mind that Python is zero-indexed, so the first column is 0. In that case, you might want to change range(1,17) to range(0,16), or simply range(16), since range defaults to a start value of 0.
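The difference between looking up a label and looking up a position can be seen on a tiny made-up frame. The column names S_1 and F_1 are only for illustration:

```python
import pandas as pd

# Made-up frame whose column labels are strings, as after reading a file with a header row.
df = pd.DataFrame({"S_1": [0.0, 0.1, 0.2], "F_1": [1.0, 1.5, 2.0]})

try:
    df[1]  # looks for a column *labelled* 1, which does not exist
except KeyError as exc:
    print("KeyError:", exc)

first_column = df.iloc[:, 0]   # position 0 -> column "S_1"
second_column = df.iloc[:, 1]  # position 1 -> column "F_1"
print(second_column.tolist())  # [1.0, 1.5, 2.0]
```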
I would strongly recommend against locating by position, though. If you have good column names, you should use those instead, since it is more robust. When you select a column by name, you get exactly what you asked for. When you select by position, you can silently get the wrong column if the columns are ever reordered.
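As a small sketch of name-based selection, the pairs can be built from the column names themselves. The frame below is made up, and it assumes the S_/F_ naming from the first code block in the question:

```python
import pandas as pd

# Made-up frame; the S_/F_ naming follows the first code block in the question.
df = pd.DataFrame({
    "S_1_1": [0.0, 0.1], "F_1_1": [1.0, 2.0],
    "S_1_2": [0.0, 0.1], "F_1_2": [3.0, 4.0],
})
shuffled = df[["F_1_2", "S_1_1", "F_1_1", "S_1_2"]]  # same data, new column order

# Build each pair from its name, so column order no longer matters.
pairs = [(name, "F" + name[1:]) for name in shuffled.columns if name.startswith("S_")]
for s_name, f_name in pairs:
    print(s_name, f_name, shuffled[f_name].tolist())
```

After the shuffle, shuffled.iloc[:, 0] is a different column than df.iloc[:, 0], while shuffled["F_1_2"] still returns the same data as df["F_1_2"].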