Drop columns in series/dataframe MultiIndex

Time:08-20

I am not sure whether I have a DataFrame or a Series. After loading my data into it, I would like to drop the first two columns, but that does not work.

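import pandas as pd

csv_data1 = pd.read_csv("file1.csv", delimiter=",")
# An explicit unit is needed; newer pandas versions reject the unit-less "datetime64".
csv_data1["Date"] = csv_data1["Date"].astype("datetime64[ns]")

csv_data2 = pd.read_csv("file2.csv", delimiter=",")
csv_data2["Date"] = csv_data2["Date"].astype("datetime64[ns]")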

csv_data1.set_index(["Vendor ID", "PO #", "Item ID"], drop=False, inplace=True)
csv_data2.set_index(["Vendor ID", "PO #", "Item ID"], drop=False, inplace=True)

difference = ((csv_data1["Date"] - csv_data2["Date"]).dt.days / 7).abs()

Here is my DF/Series:

Vendor ID  PO #      Item ID  
TRLIM      20210339  X18TE1779    17.714286
                     X18TE1779    17.714286
                     X18TE1779    17.714286
                     X18TE1779    17.714286
                     X18TE1780    17.714286
                                    ...
TRSSP      NaN       X13SL0458          NaN
                     X15TE0334          NaN
                     X17TR1674          NaN
                     X32TR2654          NaN
                     X50TE7420          NaN

I would like to drop the Vendor ID and PO # index levels/columns. I have tried this:

difference.drop(labels=['Vendor ID', 'PO #'], axis = 1)

Which gives me:

ValueError: No axis named 1 for object type Series

As well as this:

difference.dropna(inplace = True)

So how could I do this? If it is a series, is there a way to convert it to a DF? I have tried difference.to_frame(), but I am not sure how to check.

CodePudding user response:

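The result of subtracting two columns is a Series, and Vendor ID and PO # are levels of its MultiIndex rather than columns, which is why drop with axis=1 fails. You need to use droplevel: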

out = difference.droplevel(['Vendor ID', 'PO #'])

output:

Item ID
X18TE1779    17.714286
X18TE1779    17.714286
X18TE1779    17.714286
X18TE1779    17.714286
X18TE1780    17.714286
X13SL0458          NaN
X15TE0334          NaN
X17TR1674          NaN
X32TR2654          NaN
X50TE7420          NaN
dtype: float64
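
To address the question of how to check the type and convert to a DataFrame, here is a minimal sketch on made-up data. The values, the second PO number, and the column name "weeks" are assumptions for illustration only, not taken from the original files.

```python
import pandas as pd

# Toy stand-in for the question's result; the values are invented.
index = pd.MultiIndex.from_tuples(
    [
        ("TRLIM", 20210339, "X18TE1779"),
        ("TRLIM", 20210339, "X18TE1780"),
        ("TRSSP", 20210340, "X13SL0458"),
    ],
    names=["Vendor ID", "PO #", "Item ID"],
)
difference = pd.Series([17.714286, 17.714286, float("nan")], index=index)

# Check what kind of object this is.
is_series = isinstance(difference, pd.Series)
print(type(difference).__name__)

# Drop the two unwanted index levels, keeping only "Item ID".
out = difference.droplevel(["Vendor ID", "PO #"])

# Convert the Series to a one-column DataFrame ("weeks" is a hypothetical name).
frame = out.to_frame(name="weeks")
print(frame)
```

Note that to_frame returns a new object rather than modifying the Series in place, so the result has to be assigned to a variable.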

CodePudding user response:

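difference.drop() removes row or column labels, not index levels. Since you have created an index, use reset_index instead. Be aware that difference.reset_index(drop=True) discards every index level, including Item ID, and leaves a default integer index. To drop only the two levels, pass them explicitly: difference.reset_index(level=['Vendor ID', 'PO #'], drop=True)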
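
As a rough illustration of how reset_index behaves on a MultiIndex Series, the sketch below compares resetting the whole index with resetting only selected levels. The data is invented for the example.

```python
import pandas as pd

# Toy Series with the same three index levels as in the question (made-up values).
index = pd.MultiIndex.from_tuples(
    [("TRLIM", 20210339, "X18TE1779"), ("TRSSP", 20210340, "X13SL0458")],
    names=["Vendor ID", "PO #", "Item ID"],
)
difference = pd.Series([17.714286, 2.0], index=index)

# drop=True with no level argument throws away the entire index.
all_gone = difference.reset_index(drop=True)

# Naming the levels removes only those and keeps "Item ID" as the index.
kept = difference.reset_index(level=["Vendor ID", "PO #"], drop=True)

print(all_gone)
print(kept)
```

With level given and drop=True, the result matches what droplevel produces for the same levels.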
