I'm struggling to make my pandas data-loading code look "good". I would like to adhere to PEP 8 as closely as possible, for example by keeping lines to at most 80 characters. Right now my lines are far too long because of the (unwieldy) way pandas works. For example:
df_ndsi_feature = df_stations_date.loc[:, df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")]
ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T
df_ndvi_feature = df_stations_date.loc[:, df_stations_date.columns.str.fullmatch("ndvi_avg-\\d")]
ndvi_feature = df_ndvi_feature.to_numpy(dtype=np.float32).T
ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
date_features = np.append(date_features, ndsi_ndvi_feature, axis=1)
As you can see, they are very long, and I don't really know how to break them properly.
CodePudding user response:
I would put the line breaks after each argument.
For example:
df_ndsi_feature = df_stations_date.loc[:,
        df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")]
ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T
df_ndvi_feature = df_stations_date.loc[:,
        df_stations_date.columns.str.fullmatch("ndvi_avg-\\d")]
ndvi_feature = df_ndvi_feature.to_numpy(dtype=np.float32).T
ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
date_features = np.append(date_features, ndsi_ndvi_feature, axis=1)
The continuation lines could be indented even further if needed, but it's best when the arguments line up, for clarity and readability.
For example:
df_ndsi_feature = df_stations_date.loc[:,
                df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")]
This would also work but is less neat.
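How far a continuation line can be pushed in depends on how much room the statement leaves. As a rough sketch (the 8-space indent and the variable names below are illustrative choices, not part of the original code), this compares two indents against PEP 8's 79-character limit:

```python
PEP8_LIMIT = 79  # maximum line length recommended by PEP 8

first_line = "df_ndsi_feature = df_stations_date.loc[:,"
argument = r'df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")]'

# Option 1: align the continuation with the first character after "[".
aligned_indent = first_line.index("[") + 1
aligned = " " * aligned_indent + argument

# Option 2: a fixed continuation indent (8 spaces is an arbitrary choice).
hanging = " " * 8 + argument

for name, line in [("aligned", aligned), ("hanging", hanging)]:
    verdict = "fits" if len(line) <= PEP8_LIMIT else "too long"
    print(f"{name}: {len(line)} characters, {verdict}")
```

With names this long, aligning with the opening bracket overshoots the limit, so a smaller fixed indent is the practical choice for these particular lines.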
I also found this great Q&A with further explanation and examples of this style of solution here.
CodePudding user response:
My advice is to use a formatter such as black across the whole team and stop trying to format the code by hand to match a standard. That way, everything the team writes is formatted consistently.
PEP 8 allows "many" optional ways of doing things, so it is hard to achieve a common format even when everyone complies with the standard.
If you use Jupyter as your IDE, you can try the jupyter-black plugin.
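This is your code sample formatted with black (note that black's default line length is 88 characters; pass --line-length 79 if you want to stay within PEP 8's limit):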
df_ndsi_feature = df_stations_date.loc[
    :, df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")
]
ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T
df_ndvi_feature = df_stations_date.loc[
    :, df_stations_date.columns.str.fullmatch("ndvi_avg-\\d")
]
ndvi_feature = df_ndvi_feature.to_numpy(dtype=np.float32).T
ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
date_features = np.append(date_features, ndsi_ndvi_feature, axis=1)
Keep in mind, though, that the same code can be formatted in different ways and still comply with PEP 8. For example:
df_ndsi_feature = df_stations_date.loc[
    :,
    df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")
]
ndsi_feature = df_ndsi_feature.to_numpy(dtype=np.float32).T
df_ndvi_feature = df_stations_date.loc[
    :,
    df_stations_date.columns.str.fullmatch("ndvi_avg-\\d")
]
ndvi_feature = df_ndvi_feature.to_numpy(dtype=np.float32).T
ndsi_ndvi_feature = np.append(ndsi_feature, ndvi_feature, axis=1)
date_features = np.append(date_features, ndsi_ndvi_feature, axis=1)
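If you want some reassurance that two layouts really are the same code, one option is to compare their syntax trees. The sketch below uses only the standard library; the helper name same_syntax_tree is made up for illustration. As far as I know, black runs a comparable equivalence check on its own output by default.

```python
import ast

# Two layouts of the same statement (raw strings keep the "\\d" intact).
black_style = r'''
df_ndsi_feature = df_stations_date.loc[
    :, df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")
]
'''

one_per_line = r'''
df_ndsi_feature = df_stations_date.loc[
    :,
    df_stations_date.columns.str.fullmatch("ndsi_avg-\\d")
]
'''


def same_syntax_tree(first, second):
    """Return True if both snippets parse to the same syntax tree."""
    # ast.dump() omits line and column numbers by default, so layout
    # differences disappear and only the structure is compared.
    return ast.dump(ast.parse(first)) == ast.dump(ast.parse(second))


print(same_syntax_tree(black_style, one_per_line))
```

The snippets are only parsed, never executed, so pandas does not need to be installed for this check.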
You can also use a tool like flake8 (my preference is to use it with the wemake-python-styleguide plugin) to check for formatting and other issues.