I'm using Polars to process large Excel files. The resulting dataframe is written to CSV with
pldf_csv.write_csv("C:\\mydir\\code testings\\result.csv")
The column 3PL TN is a numeric-looking ID for the table, with values like 004087782412. I have made sure that the column's datatype is string.
If I preview the CSV from the local library, the format is fine too, but when I open the file the column is changed automatically.
I have also tried changing the type to Utf8, but to no avail. Is there anything I can do? My data is large and there will be several CSV results, so I don't want to resort to looping over the rows and writing them out myself.
CodePudding user response:
This isn't a Polars problem but an Excel one. For instance, if I open Notepad and create a CSV that looks like:
3PL TN, weight
004092306725, 1
004092306726, 2
and then open it in Excel, it does the same thing: the leading 0s are dropped.
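To check that the file itself is intact and that only Excel's interpretation changes, you can read the same sample back with a parser that does no type guessing. This is a minimal sketch using Python's standard csv module on the sample above; skipinitialspace=True is only there because the hand-typed sample has a space after each comma.

```python
import csv
import io

# The same hand-written sample as above, kept as plain text.
csv_text = "3PL TN, weight\n004092306725, 1\n004092306726, 2\n"

# csv.DictReader returns every field as a string and never guesses types,
# so the leading zeros stored in the file are visible as-is.
reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
ids = [row["3PL TN"] for row in reader]

print(ids)  # ['004092306725', '004092306726']
```

The zeros are present in the text of the file; they only disappear when Excel parses the field as a number.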
This answer tells us that if a value is preceded by a tab, Excel treats it as a string rather than a number.
With that info, you can save the file like this:
pldf_csv.with_columns((pl.lit("\t") + pl.col('3PL TN')).alias('3PL TN')).write_csv("C:\\mydir\\code testings\\result.csv")
What this does is concatenate a tab in front of each value. You have to wrap pl.lit("\t") + pl.col('3PL TN') in parentheses; otherwise alias applies only to pl.col('3PL TN'), the result is not named 3PL TN, and you end up with an extra column.