For a dataframe such as this:
| | Col1 | Col2 |
|---|---|---|
| 1 | A | D |
| 2 | B | A |
| 3 | C | B |
Desired outcome:
Unique occurrences of values in Col1 and Col2 in order of appearance by row
i.e. unique_list = [A, D, B, C]
Problem:
I need a way to do this that minimises iteration and processing, because there are many dataframes and they are large.
CodePudding user response:
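Use DataFrame.iloc to select the first 2 columns, reshape them into a single Series with DataFrame.stack (which goes through the values row by row), and get the unique values in order of appearance with Series.unique: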
unique_list = df.iloc[:, :2].stack().unique().tolist()
print(unique_list)
['A', 'D', 'B', 'C']
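For reference, here is a self-contained sketch that rebuilds the sample dataframe from the question and runs the one-liner above. It assumes the 1..3 values are the row index, and it adds a possible alternative (not part of the original answer) that flattens the underlying NumPy array in row-major order and passes it to pd.unique, which might avoid some reshaping overhead on large frames. The name alt_list is only illustrative.

```python
import pandas as pd

# Sample frame from the question; the 1..3 labels are assumed to be the index.
df = pd.DataFrame(
    {"Col1": ["A", "B", "C"], "Col2": ["D", "A", "B"]},
    index=[1, 2, 3],
)

# stack() emits values row by row: A, D, B, A, C, B.
# unique() then keeps the first occurrence of each value, in that order.
unique_list = df.iloc[:, :2].stack().unique().tolist()

# Alternative sketch: ravel() flattens the 2-D array row by row (C order),
# and pd.unique deduplicates while preserving order of appearance.
alt_list = pd.unique(df[["Col1", "Col2"]].to_numpy().ravel()).tolist()

print(unique_list)
print(alt_list)
```

One difference to keep in mind: depending on the pandas version, stack() may drop missing values by default, whereas the flattened array keeps them, so the two results can differ when the columns contain NaN.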