Currently I have this situation:
  signal_name   timestamp signal_value
0       alert  1632733513           on
1       alert  1632733515          off
2       alert  1632733518           on
I want to rename the column signal_value to the value found in signal_name. The df was filtered on the signal name alert, so there is no other value in signal_name.
  signal_name   timestamp alert
0       alert  1632733513    on
1       alert  1632733515   off
2       alert  1632733518    on
Since the signal name is now carried by the column header, the first column is no longer needed, so I would like to drop it.
    timestamp alert
0  1632733513    on
1  1632733515   off
2  1632733518    on
Since there are multiple dfs (each filtered on a different signal_name) with this problem, the approach should be generic.
CodePudding user response:
If you control the part where the dataframe is filtered on signal_name, you can rename the column with the same value used in the filter.
Otherwise, you can read the first value of the signal_name column into a Python variable and then use it to rename the signal_value column:
# Assumes an existing SparkSession named `spark` (as in the pyspark shell).
data = [("alert", "1632733513", "on"), ("alert", "1632733515", "off"), ("alert", "1632733518", "on")]
df = spark.createDataFrame(data, ["signal_name", "timestamp", "signal_value"])
# The df is already filtered, so the first row's signal_name applies to every row.
signal_name = df.select("signal_name").first().signal_name
df1 = df.withColumnRenamed("signal_value", signal_name).drop("signal_name")
df1.show()
# +----------+-----+
# | timestamp|alert|
# +----------+-----+
# |1632733513|   on|
# |1632733515|  off|
# |1632733518|   on|
# +----------+-----+