Home > Enterprise >  Writing pandas column to csv without merging integers
Writing pandas column to csv without merging integers

Time:10-11

I have extracted user_id against shop_ids as pandas dataframe from database using SQL query.

    user_id     shop_ids
0   022221205   541
1   023093087   5088,4460,4460,4460,4460,4460,4460,4460,5090
2   023096023   2053,2053,2053,2053,2053,2053,2053,2053,2053,1...
3   023096446   4339,4339,3966,4339,4339
4   023098684   5004,3604,5004,5749,5004

I am trying to write this dataframe into csv using: df.to_csv('users_ordered_shops.csv')

I end up with the csv merging the shop ids into one number as such:

    user_id     shop_ids
0   22221205    541
1   23093087    508,844,604,460,446,000,000,000,000,000,000,000
2   23096023    2,053,205,320,532,050,000,000,000,000,000,000,000,000,000,000,000,000
3   23096446    43,394,339,396,643,300,000
4   23098684    50,043,604,500,457,400,000

The values for index 2 are:

print(df.iloc[2].shop_ids)
2053,2053,2053,2053,2053,2053,2053,2053,2053,1294,1294,2053,1922

Expected output is a csv file with all shop_ids intact in one column or different columns like:

    user_id     shop_ids
0   022221205   541
1   023093087   5088,4460,4460,4460,4460,4460,4460,4460,5090
2   023096023   2053,2053,2053,2053,2053,2053,2053,2053,2053,1294,1294,2053,1922
3   023096446   4339,4339,3966,4339,4339
4   023098684   5004,3604,5004,5749,5004

Any tips on how to get the shop ids without merging when writing to a csv file? I have tried converting the shop_ids column using astype() to int and str which has resulted in the same output.

CodePudding user response:

Update

To get one shop per column (and remove duplicates), you can use:

pd.concat([df['user_id'],
           df['shop_ids'].apply(lambda x: sorted(set(x.split(','))))
                         .apply(pd.Series)],
          axis=1).to_csv('users_ordered_shops.csv', index=False)

Change the delimiter. Try:

df.to_csv('users_ordered_shops.csv', sep=';')

Or change the quoting strategy:

import csv

df.to_csv('users_ordered_shops.csv', quoting=csv.QUOTE_NONNUMERIC)
  • Related