I have a CSV file like this (comma separated)
ID, Name,Context, Location
123,"John","{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}","Road 1"
234,"Mike","{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}","Road 2"
I want to create DataFrame like this:
ID | Name |Context |Location
123| John |{\"Organization\":{\"Id\":12345,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}|Road 1
234| Mike |{\"Organization\":{\"Id\":23456,\"IsDefault\":false},\"VersionNumber\":-1,\"NewVersionId\":\"88229ef9-e97b-4b88-8eba-31740d48fd15\",\"ApiIntegrationType\":0,\"PortalIntegrationType\":0}|Road 2
Could you help and show me how to use pandas read_csv doing it?
CodePudding user response:
An answer - if you are willing to accept that the \
char gets stripped:
pd.read_csv(your_filepath, escapechar='\\')
ID Name Context Location
0 123 John {"Organization":{"Id":12345,"IsDefault":false}... Road 1
1 234 Mike {"Organization":{"Id":23456,"IsDefault":false}... Road 2
An answer if you actually want the backslashes in - using a custom converter:
def backslash_it(x):
return x.replace('"','\\"')
pd.read_csv(your_filepath, escapechar='\\', converters={'Context': backslash_it})
ID Name Context Location
0 123 John {\"Organization\":{\"Id\":12345,\"IsDefault\":... Road 1
1 234 Mike {\"Organization\":{\"Id\":23456,\"IsDefault\":... Road 2
escapechar
on read_csv
is used to actually read the csv
then the custom converter puts the backslashes back in.
Note that I tweaked the header row to make the column name match easier:
ID,Name,Context,Location