How to sort multiple lists while using datetime

Time:01-08

I need to sort the app reviews in my CSV file by date so that I can later plot them as scatter graphs with matplotlib.

My current code is:

def example():
    date = []
    positive = []
    negative = []
    ratings = []
    for row in app_data:
        date.append(row[0])
        positive.append(row[3])
        negative.append(row[2])
        ratings.append(row[4])
    final = sorted(zip(date, negative, positive, ratings), reverse=True)
    final.sort(key=lambda date: datetime.datetime.strptime(date, "%m/%d/%Y"))
    print(final)

The problem is that I can only get the dates into the right order using datetime, but I need all the lists to stay in the same order so the reviews don't get mixed up.

There is extra data in my CSV file after the columns I've read in, if that affects the answer at all.

It might also be worth mentioning that some reviews have the same date too.

An example of my data would be:

DATE,NAME,NEGATIVE REVIEW,POSITIVE REVIEW,RATING GIVEN
03/10/2005,teams,No Negative, very easy to use,8
03/10/2005,skype,i hate this app, No Positive,2.5
12/26/2005,skype,hard to navigate initially, easy to use once you learn the layout,6
07/10/2006,instagram,this app ruined my life,No Positive,1.5

Thank you.

CodePudding user response:

This might help:

import pandas as pd
rows = [['02/04/2022', 'good','not good', 4.2],
        ['02/03/2022', 'good','not good', 4.2],
        ['01/01/2022', 'good','not good', 4.2],
        ['04/04/2022', 'good','not good', 4.2],
        ['02/03/2022', 'good','bad', 4.2],
        ['02/03/2022', 'very Good','not good', 4.2],
        ['02/03/2022', 'Best','not good', 4.2],]

df = pd.DataFrame(rows)
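# Parse the date strings so they sort chronologically rather than alphabetically
# (month/day/year is assumed here, matching the format used in the question)
df[0] = pd.to_datetime(df[0], format="%m/%d/%Y")
# Pass a single column name to sort_values to sort on one column,
# or a list of column names to sort on several
# 0, 1 and 2 are the column names
df = df.sort_values([0, 1, 2])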

When you sort on multiple columns, a later column only matters where the earlier columns tie. In the code above, the rows are sorted on column 0. Where the values in column 0 are the same, column 1 decides the order. Where the values in both column 0 and column 1 are the same, column 2 decides.

EDIT: this will work for your data as well, as long as the date column is parsed as dates and not compared as plain strings.
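The tie-breaking rule described above is not specific to pandas. As a minimal sketch, the same ordering can be reproduced with the built-in sorted() and a tuple key. The shortened `rows` sample, the `sort_key` helper, and the month-first date format (taken from the question) are illustrative assumptions, not part of this answer:

```python
from datetime import datetime

rows = [
    ["02/04/2022", "good", "not good", 4.2],
    ["02/03/2022", "good", "not good", 4.2],
    ["01/01/2022", "good", "not good", 4.2],
    ["02/03/2022", "good", "bad", 4.2],
    ["02/03/2022", "Best", "not good", 4.2],
]

def sort_key(row):
    # Tuples compare element by element, so column 1 is only consulted
    # when the dates tie, and column 2 only when columns 0 and 1 both tie.
    # Assumption: dates are month/day/year, as in the question's CSV.
    return (datetime.strptime(row[0], "%m/%d/%Y"), row[1], row[2])

sorted_rows = sorted(rows, key=sort_key)
for row in sorted_rows:
    print(row)
```

Because each row is sorted as a unit, the review text and rating always stay attached to their date.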

CodePudding user response:

You can use the sorted() function and specify a key function that takes an inner list as input and returns the first element of the inner list.

data = [['02/04/2022', 'good','not good', 4.2],
        ['02/03/2022', 'good','not good', 4.2],
        ['01/01/2022', 'good','not good', 4.2],
        ['04/04/2022', 'good','not good', 4.2],
        ['02/03/2022', 'good','bad', 4.2],
        ['02/03/2022', 'very Good','not good', 4.2],
        ['02/03/2022', 'Best','not good', 4.2],]

sorted_data = sorted(data, key=lambda x: x[0])

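This will sort the list in ascending order of the date string. Note that this is a plain string comparison, so it is only chronological when all the dates fall in the same year: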

[['01/01/2022', 'good','not good', 4.2],
 ['02/03/2022', 'good','not good', 4.2],
 ['02/03/2022', 'good','bad', 4.2],
 ['02/03/2022', 'very Good','not good', 4.2],
 ['02/03/2022', 'Best','not good', 4.2],
 ['02/04/2022', 'good','not good', 4.2],
 ['04/04/2022', 'good','not good', 4.2]]

After this, you can split each piece of information into its own list using list comprehensions. Iterate over sorted_data, not data, so the lists keep the sorted order:

dates = [x[0] for x in sorted_data]
first_rating = [x[1] for x in sorted_data]
second_rating = [x[2] for x in sorted_data]
scores = [x[3] for x in sorted_data]
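As an alternative sketch, zip(*rows) can split already-sorted rows into columns in one step. The three-row `sorted_data` sample below is made up for illustration, and the unpacking assumes every row has exactly four fields and that there is at least one row:

```python
sorted_data = [
    ["01/01/2022", "good", "not good", 4.2],
    ["02/03/2022", "good", "bad", 4.2],
    ["02/04/2022", "very Good", "not good", 4.2],
]

# zip(*rows) transposes rows into columns; each column comes back as a tuple,
# so convert to lists if you need to modify them later.
dates, first_rating, second_rating, scores = (list(col) for col in zip(*sorted_data))

print(dates)
print(scores)
```

If the rows carry extra fields, as the question mentions, the explicit list comprehensions are the safer choice because they pick columns by index.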

CodePudding user response:

You can use the datetime module to compare dates. datetime.strptime converts a string to a datetime object, and sorted() accepts a key parameter that lets you sort by the result of another operation. Here is a simple way to do this:

from datetime import datetime
data = [['02/04/2022', 'good','not good', 4.2],
        ['02/03/2022', 'good','not good', 4.2],
        ['01/01/2022', 'good','not good', 4.2],
        ['04/04/2022', 'good','not good', 4.2],
        ['02/03/2022', 'good','bad', 4.2],
        ['02/03/2022', 'very Good','not good', 4.2],
        ['02/03/2022', 'Best','not good', 4.2],]
sorted_data = sorted(data, key=lambda x: datetime.strptime(x[0],"%d/%m/%Y"))
print(*sorted_data,sep="\n")

You can also change the date format by changing this part (for example, the month-first dates in the question would need "%m/%d/%Y"):

"%d/%m/%Y"
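Putting the pieces together for the question's CSV layout, a rough end-to-end sketch might look like the following. It assumes the column order shown in the question (date first, rating fifth) and month-first dates. The in-memory `sample` stands in for the real file, and the filename mentioned in the comment is hypothetical:

```python
import csv
import io
from datetime import datetime

# Stand-in for the real file; with a file on disk you would use
# open("reviews.csv", newline="") instead (the filename is hypothetical).
sample = io.StringIO(
    "DATE,NAME,NEGATIVE REVIEW,POSITIVE REVIEW,RATING GIVEN\n"
    "07/10/2006,instagram,this app ruined my life,No Positive,1.5\n"
    "03/10/2005,teams,No Negative, very easy to use,8\n"
    "12/26/2005,skype,hard to navigate initially, easy to use once you learn the layout,6\n"
    "03/10/2005,skype,i hate this app, No Positive,2.5\n"
)

reader = csv.reader(sample)
next(reader)  # skip the header row
rows = list(reader)

# Sort whole rows by the parsed date. list.sort() is stable, so reviews
# that share a date keep their original relative order.
rows.sort(key=lambda row: datetime.strptime(row[0], "%m/%d/%Y"))

# Pull out the columns needed for plotting; any extra columns are ignored.
dates = [datetime.strptime(row[0], "%m/%d/%Y") for row in rows]
negative = [row[2] for row in rows]
positive = [row[3] for row in rows]
ratings = [float(row[4]) for row in rows]

print([d.strftime("%m/%d/%Y") for d in dates])
print(ratings)
```

The `dates` list holds datetime objects rather than strings, which matplotlib can generally place on a time axis directly.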
