Custom Filtering and Sorting

Time:10-08

I have a DataFrame called Planes. Here it is:

        model     year    range  seating    price  length  wingspan
0    A300-600   1972.0   7500.0      345  Unknown   54.10     44.84
1    A310-300  Unknown   8050.0      280  Unknown   46.66     43.90
2     A320neo   2014.0   6300.0      194    110.0   37.57     35.80
3     A321neo   2014.0   7400.0      244    129.0   44.51     35.80
4    A330-200   1997.0  13450.0      406    238.5   58.82     60.30
5    A330-300   1992.0  11750.0      440    264.0   63.66     60.30
6    A330-800   2018.0  15094.0      406    259.0   58.82     64.00
7    A330-900   2017.0  13334.0      440    317.0   63.66     64.00
8    A340-200   1993.0  12400.0      420  Unknown   59.40     60.30
9    A340-300   1991.0  13500.0      440  Unknown   63.69     60.30
10   A340-500   1997.0  16670.0      440  Unknown   67.93     63.45
11   A340-600   2001.0  14450.0      475    275.0   75.36     63.45
12   A350-900   2013.0  15000.0      440    317.4   66.80     64.75
13  A350-1000   2016.0  16100.0      440    166.5   73.79     64.75
14       A380   2005.0  14800.0      853    445.6   72.70     79.80
15    737-800   1997.0   5436.0      175     82.0   39.47     35.79
16    747-400   1989.0  15000.0      624    267.0   71.00     64.00
17    757-200   1978.0   7222.0      234     50.0   47.32     38.05
18    757-300   1996.0   6287.0      280     80.0   54.47     38.05
19    767-200   1978.0  Unknown      290    128.0   48.50     47.60
20    767-300   1994.0   7300.0      351    152.0   54.90     47.60
21  777-300ER   2004.0  14685.0      550  Unknown   73.90     64.80
22    787-900   2009.0  15750.0      290    264.6   63.00     62.00

I want to show only the model and seating columns, keep only the rows where seating is greater than 350, and sort the result by seating. How can I do this in code?

CodePudding user response:

You can use loc and sort_values:

(df.loc[df['seating'] > 350, ['model', 'seating']]
   .sort_values('seating', ignore_index=True)
)

Output:

          model   seating
0       767-300       351
1      A330-200       406
2      A330-800       406
3      A340-200       420
4      A330-300       440
5      A330-900       440
6      A340-300       440
7      A340-500       440
8      A350-900       440
9     A350-1000       440
10     A340-600       475
11    777-300ER       550
12      747-400       624
13         A380       853
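
One caveat, offered as a sketch rather than part of the answer above: several columns in the question contain the string Unknown. The seating column shown has no such values, but if it did, pandas would store it as strings and the > 350 comparison would raise a TypeError. The example below assumes a small hypothetical sample (one seating value is deliberately replaced with Unknown, which is not in the original data) and uses pd.to_numeric with errors="coerce" to turn unparseable entries into NaN before filtering.

```python
import pandas as pd

# Hypothetical sample: a few models from the question, with one seating
# value swapped for "Unknown" purely to illustrate the problem.
df = pd.DataFrame({
    "model": ["A300-600", "A330-200", "A380", "737-800", "767-300"],
    "seating": [345, 406, "Unknown", 175, 351],
})

# Convert to numbers; anything that cannot be parsed becomes NaN.
seats = pd.to_numeric(df["seating"], errors="coerce")

# NaN > 350 evaluates to False, so rows with unknown seating are excluded.
result = (df.assign(seating=seats)
            .loc[seats > 350, ["model", "seating"]]
            .sort_values("seating", ignore_index=True))
print(result)
```

Because NaN is a float, the converted seating column has a float dtype in this sketch.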

CodePudding user response:

Try this:

result = df.loc[df["seating"] > 350][["model", "seating"]].sort_values("seating")

df["seating"] > 350 builds a boolean mask that is True only for rows where seating is greater than 350, and df.loc[...] keeps those rows

[["model", "seating"]] selects only the model and seating columns

.sort_values("seating") sorts the resulting DataFrame by the given column (here seating), in ascending order by default
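
The three steps described above can also be written out one at a time, which makes each intermediate result easy to inspect. This is a minimal sketch that assumes a handful of rows copied from the question's table; the names mask, rows, cols and result are illustrative only.

```python
import pandas as pd

# A few rows copied from the question's table (model and seating only).
df = pd.DataFrame({
    "model": ["A300-600", "A320neo", "A330-200", "A380", "767-300", "747-400"],
    "seating": [345, 194, 406, 853, 351, 624],
})

mask = df["seating"] > 350             # boolean Series, one value per row
rows = df.loc[mask]                    # keep only rows where the mask is True
cols = rows[["model", "seating"]]      # keep only the two columns of interest
result = cols.sort_values("seating")   # ascending by default

print(mask.tolist())
print(result)
```

Note that sort_values keeps the original index labels unless ignore_index=True is passed, so the sorted rows here still carry the labels 4, 2, 5, 3 from the unsorted frame.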
