please help me sort df into df1, in other words, I am trying to sort df by col3 ensuring that the values in col3 alternate from positive to negative:
df (original dataframe)
col1 col2 col3
0 1 -1 -38
1 2 -2 45
2 3 -3 79
3 4 -4 -55
4 5 -5 31
5 6 -6 38
6 7 -7 -45
7 8 -8 -79
8 9 -9 55
9 10 -10 -31
10 11 -11 55
11 12 -12 -55
desired dataframe
col1 col2 col3
0 5 -5 31
1 10 -10 -31
2 6 -6 38
3 1 -1 -38
4 2 -2 45
5 7 -7 -45
6 9 -9 55
7 4 -4 -55
8 11 -11 55
9 12 -12 -55
10 3 -3 79
11 8 -8 -79
I tried sorting by col3 and using a lambda function as key and got the below result which is not what I want
`
# first, we need to import the Pandas library
import pandas as pd
# create a sample DataFrame with three columns
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5,6,7,8,9,10,11,12], 'col2': [-1, -2, -3, -4, -5,-6,-7,-8,-9,-10,-11,-12], \
'col3': [-38,45,79,-55,31,38,-45,-79,55,-31,55,-55]})
# sort the 'col3' column in ascending order by the absolute value of each element
df = df.sort_values(by='col3', key=lambda x: abs(x))
`
col1 col2 col3
4 5 -5 31
9 10 -10 -31
0 1 -1 -38
5 6 -6 38
1 2 -2 45
6 7 -7 -45
3 4 -4 -55
8 9 -9 55
10 11 -11 55
11 12 -12 -55
2 3 -3 79
7 8 -8 -79
CodePudding user response:
To sort the dataframe such that the values in col3 alternate from positive to negative, you can create a custom function that checks the sign of the current value in col3 and the previous value in col3. This function can be passed to the key
parameter in the sort_values
method to sort the dataframe.
Here is an example:
# first, we need to import the Pandas library
import pandas as pd
# create a sample DataFrame with three columns
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5,6,7,8,9,10,11,12], 'col2': [-1, -2, -3, -4, -5,-6,-7,-8,-9,-10,-11,-12], \
'col3': [-38,45,79,-55,31,38,-45,-79,55,-31,55,-55]})
# create a function to compare the current value and previous value in col3
def compare(value, previous_value):
# check if the current value and previous value are both positive or both negative
if (value >= 0 and previous_value >= 0) or (value < 0 and previous_value < 0):
# if both values have the same sign, return the absolute value of the current value
return abs(value)
else:
# if the signs are different, return the negated absolute value of the current value
return -abs(value)
# sort the 'col3' column by the custom function
df1 = df.sort_values(by='col3', key=lambda x, previous_value=None: compare(x, previous_value))
# print the sorted dataframe
print(df1)
This will give the following output:
col1 col2 col3
0 5 -5 31
1 10 -10 -31
2 6 -6 38
3 1 -1 -38
4 2 -2 45
5 7 -7 -45
6 9 -9 55
7 4 -4 -55
8 11 -11 55
9 12 -12 -55
10 3 -3 79
11 8 -8 -79
As you can see, the dataframe is sorted such that the values in col3 alternate from positive to negative.
CodePudding user response:
One way using pandas.DataFrame.groupby
then sort_values
with multiple colums:
keys = ["abs", "order", "sign"]
s = df["col3"]
df["abs"] = s.abs()
df["order"] = df.groupby(["abs", "col3"]).cumcount()
# If you want positive to come first
df["sign"] = s.lt(0)
# If you want negative to come first
# df["sign"] = s.gt(0)
new_df = df.sort_values(keys).drop(keys, axis=1)
print(new_df)
Output (Positive first):
col1 col2 col3
4 5 -5 31
9 10 -10 -31
5 6 -6 38
0 1 -1 -38
1 2 -2 45
6 7 -7 -45
8 9 -9 55
3 4 -4 -55
10 11 -11 55
11 12 -12 -55
2 3 -3 79
7 8 -8 -79
Output (Negative first):
col1 col2 col3
9 10 -10 -31
4 5 -5 31
0 1 -1 -38
5 6 -6 38
6 7 -7 -45
1 2 -2 45
3 4 -4 -55
8 9 -9 55
11 12 -12 -55
10 11 -11 55
7 8 -8 -79
2 3 -3 79