I would like to get the positions of columns that share the same name (here, column A).
DataFrame a:
A      B      A      C
text1  text3  text5  text7
text2  text4  text6  text8
I can get the position of the first column A, but how do I get the position of the second one? There are multiple dataframes with different numbers of columns, and the position of A is not the same across them. Thank you.
for col in a.columns:
    if col == 'A':
        indx1 = a.columns.get_loc(col)
        # if second column A
        indx2 = a.columns.get_loc(col)
CodePudding user response:
Your result can be easily achieved using np.where().
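import numpy as np
import pandas as pd

df = pd.DataFrame(
    data=[["text1", "text2", "text5", "text7"], ["text2", "text4", "text6", "text8"]],
    columns=["A", "B", "A", "D"],
)
# Compare every column label with "A" and keep the positions that match.
np.where(df.columns == "A")[0]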
Output:
array([0, 2], dtype=int64)
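Since the question mentions several dataframes with A in different places, the same comparison can be wrapped in a small helper. This is only a sketch: the function name column_positions and the two sample frames are made up for illustration, and np.flatnonzero is swapped in for np.where(...)[0], which it matches for a one-dimensional mask.

```python
import numpy as np
import pandas as pd


def column_positions(frame, name):
    """Return the integer positions of every column called `name` (hypothetical helper)."""
    # Comparing the column Index with a label yields a boolean array;
    # flatnonzero returns the indices where that array is True.
    return np.flatnonzero(frame.columns == name).tolist()


# Two made-up frames with "A" in different positions.
first = pd.DataFrame([["text1", "text3", "text5", "text7"]], columns=["A", "B", "A", "C"])
second = pd.DataFrame([["x", "y", "z", "w", "v"]], columns=["B", "A", "C", "D", "A"])

for frame in (first, second):
    print(column_positions(frame, "A"))
```

For these two frames the loop prints [0, 2] and then [1, 4]. A name that does not occur gives an empty list.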
CodePudding user response:
res = []
for index, col in enumerate(a.columns):
    if col == 'A':
        res.append(index)
print(res)
This will give you the positions of all columns named 'A'.
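Once the positions are known, the second A column can be picked out by position. A minimal sketch, assuming the frame from the question; the variable name second_a is only for illustration.

```python
import pandas as pd

a = pd.DataFrame(
    [["text1", "text3", "text5", "text7"], ["text2", "text4", "text6", "text8"]],
    columns=["A", "B", "A", "C"],
)

# Same loop as above, written as a list comprehension.
res = [index for index, col in enumerate(a.columns) if col == "A"]

# a["A"] would return both columns; iloc picks exactly one by position.
second_a = a.iloc[:, res[1]]
print(second_a.tolist())
```

With this data the last line prints ['text5', 'text6'].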
CodePudding user response:
As a one-liner, this returns the index positions of columns whose names are repeated:
indexes = [i for i, j in enumerate(df.columns) if j in df.loc[:, df.columns.value_counts() > 1].columns]
It returns [0, 2] in this case because column A is repeated.
CodePudding user response:
To find the positions of column 'A':
np.where(df.columns == 'A')[0]
Result:
array([0, 2], dtype=int64)
To find the positions of all duplicated column names:
np.where(df.columns.duplicated(keep=False))[0]
Result:
array([0, 2], dtype=int64)
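When more than one name is repeated, the flat array above does not say which position belongs to which name. One possible way to group them is sketched below; the sample frame with a repeated B column and the name positions_by_name are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Made-up frame in which both "A" and "B" occur twice.
df = pd.DataFrame(
    [["text1", "text3", "text5", "text7", "text9"]],
    columns=["A", "B", "A", "C", "B"],
)

# Positions of every column whose name occurs more than once.
dup_positions = np.where(df.columns.duplicated(keep=False))[0]

# Group those positions under the column name they belong to.
positions_by_name = {}
for pos in dup_positions:
    positions_by_name.setdefault(df.columns[pos], []).append(int(pos))

print(positions_by_name)
```

For this frame it prints {'A': [0, 2], 'B': [1, 4]}.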