I am not sure whether I have a DataFrame or a Series. After I load my data into it, I would like to drop the first two columns (Vendor ID and PO #), but nothing I have tried works.
import pandas as pd

# Load both files and parse the Date column as datetimes
csv_data1 = pd.read_csv("file1.csv", delimiter=",")
csv_data1["Date"] = pd.to_datetime(csv_data1["Date"])
csv_data2 = pd.read_csv("file2.csv", delimiter=",")
csv_data2["Date"] = pd.to_datetime(csv_data2["Date"])
# Index both frames on the same three keys so the subtraction aligns on them
csv_data1.set_index(["Vendor ID", "PO #", "Item ID"], drop=False, inplace=True)
csv_data2.set_index(["Vendor ID", "PO #", "Item ID"], drop=False, inplace=True)
# Absolute difference between the two dates, in weeks
difference = ((csv_data1["Date"] - csv_data2["Date"]).dt.days / 7).abs()
Here is my DF/Series:
Vendor ID  PO #      Item ID
TRLIM      20210339  X18TE1779    17.714286
                     X18TE1779    17.714286
                     X18TE1779    17.714286
                     X18TE1779    17.714286
                     X18TE1780    17.714286
                                     ...
TRSSP      NaN       X13SL0458          NaN
                     X15TE0334          NaN
                     X17TR1674          NaN
                     X32TR2654          NaN
                     X50TE7420          NaN
I would like to drop the Vendor ID and the PO # index/column.
I have tried this:
difference.drop(labels=['Vendor ID', 'PO #'], axis = 1)
Which gives me:
ValueError: No axis named 1 for object type Series
As well as this:
difference.dropna(inplace = True)
So how can I do this? If it is a Series, is there a way to convert it to a DataFrame? I have tried difference.to_frame(), but I am not sure how to check whether it worked.
CodePudding user response:
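Subtracting one column from another gives you a Series, and because you called set_index first, Vendor ID and PO # are levels of its MultiIndex rather than columns. That is why drop with axis=1 fails. You need to use droplevel: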
out = difference.droplevel(['Vendor ID', 'PO #'])
output:
Item ID
X18TE1779 17.714286
X18TE1779 17.714286
X18TE1779 17.714286
X18TE1779 17.714286
X18TE1780 17.714286
X13SL0458 NaN
X15TE0334 NaN
X17TR1674 NaN
X32TR2654 NaN
X50TE7420 NaN
dtype: float64
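To answer the "how do I check" part, here is a minimal sketch. It rebuilds a small stand-in for difference by hand (the three rows and the column name "weeks" are made up for illustration, they are not your real data), checks its type, drops the two levels, and converts the result to a DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the `difference` Series from the question.
index = pd.MultiIndex.from_tuples(
    [
        ("TRLIM", 20210339, "X18TE1779"),
        ("TRLIM", 20210339, "X18TE1780"),
        ("TRSSP", 20210412, "X13SL0458"),
    ],
    names=["Vendor ID", "PO #", "Item ID"],
)
difference = pd.Series([17.714286, 17.714286, float("nan")], index=index)

# Check what kind of object you have and how many index levels it carries.
print(type(difference).__name__)
print(isinstance(difference, pd.Series))
print(difference.index.nlevels)

# Remove the two unwanted levels; only Item ID remains as the index.
out = difference.droplevel(["Vendor ID", "PO #"])
print(out.index.name)

# to_frame returns a new DataFrame, so the result has to be assigned.
# "weeks" is an assumed column name.
frame = out.to_frame(name="weeks")
print(type(frame).__name__)
print(list(frame.columns))
```

Note that to_frame does not modify difference in place, which may be why calling it on its own appeared to do nothing.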
CodePudding user response:
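difference.reset_index(level=["Vendor ID", "PO #"], drop=True)
should work, since you have created an index. Be aware that a bare difference.reset_index(drop=True) discards all three levels, including Item ID, and leaves a plain 0..n-1 index.
difference.drop()
removes rows or columns by label, not index levels. Since Vendor ID and PO # are index levels here, use reset_index with the level argument instead.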
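As a rough illustration of how reset_index behaves on a MultiIndexed Series, the sketch below compares calling it with and without the level argument. The two-row Series is a hypothetical stand-in for difference, not the real data.

```python
import pandas as pd

# Hypothetical stand-in for the `difference` Series from the question.
index = pd.MultiIndex.from_tuples(
    [
        ("TRLIM", 20210339, "X18TE1779"),
        ("TRLIM", 20210339, "X18TE1780"),
    ],
    names=["Vendor ID", "PO #", "Item ID"],
)
difference = pd.Series([17.714286, 17.714286], index=index)

# Without `level`, every index level is discarded, Item ID included.
all_dropped = difference.reset_index(drop=True)
print(list(all_dropped.index))

# With `level`, only the named levels are removed and Item ID is kept.
kept = difference.reset_index(level=["Vendor ID", "PO #"], drop=True)
print(kept.index.name)
print(list(kept.index))
```

If the goal is to keep Item ID as the index, the second form gives the same result as droplevel.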