My example Excel sheet (customer1_data.xlsx) looks like this:
parameter customer1
analysis 1
analysis_name 1month_services
analysis_duration [2022-08-23, 2022-11-02]
analysis_numcheck 1
analysis_dupcolumns 1
I import the Excel sheet as a DataFrame. It looks normal when printed, but when I query individual rows or cells, some cell values are wrapped in quotes. I don't want those quotes.
c1df = pd.read_excel('customer1_data.xlsx')
c1df.set_index('parameter',inplace=True)
print(c1df)
parameter customer1
analysis 1
analysis_name 1month_services
analysis_duration [2022-08-23, 2022-11-02]
analysis_numcheck 1
analysis_dupcolumns 1
Present output when I print individual cell values:
print(c1df.loc['analysis'])
1
print(c1df.loc['analysis_duration'])
'[2022-08-23, 2022-11-02]'
print(c1df.loc['analysis_name'])
'1month_services'
Expected output:
print(c1df.loc['analysis'])
1
print(c1df.loc['analysis_duration'])
# I don't want any quotes at the end for the list here
[2022-08-23, 2022-11-02]
print(c1df.loc['analysis_name'])
# ' ' quote is expected for the string, no issues here
'1month_services'
CodePudding user response:
You can use pandas.Series.str.split to convert the bracket-delimited strings into lists:
c1df["customer1"]= (
c1df["customer1"].str.strip("[]")
.str.split(",")
.where(c1df["customer1"].str.contains(r"[\[\]]", regex=True, na=False))
.fillna(c1df["customer1"])
)
# Output :
print(c1df)
parameter customer1
0 analysis 1
1 analysis_name 1month_services
2 analysis_duration [2022-08-23, 2022-11-02]
3 analysis_numcheck 1
4 analysis_dupcolumns 1
print(c1df.iloc[2,1])
['2022-08-23', ' 2022-11-02']
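Note that the second item above keeps its leading space, because the split is on "," only. A minimal sketch of one way to avoid that is shown below. It assumes the sheet loads with the values given in the question (the DataFrame is built in memory here instead of calling pd.read_excel), and parse_cell is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Stand-in for pd.read_excel('customer1_data.xlsx'); values copied from the question.
c1df = pd.DataFrame(
    {
        "parameter": [
            "analysis",
            "analysis_name",
            "analysis_duration",
            "analysis_numcheck",
            "analysis_dupcolumns",
        ],
        "customer1": [1, "1month_services", "[2022-08-23, 2022-11-02]", 1, 1],
    }
).set_index("parameter")


def parse_cell(value):
    """Hypothetical helper: turn a '[a, b]' string into ['a', 'b'].

    Anything that is not a bracketed string is returned unchanged.
    """
    if isinstance(value, str):
        text = value.strip()
        if text.startswith("[") and text.endswith("]"):
            # Drop the brackets, split on commas, trim whitespace per item.
            return [item.strip() for item in text[1:-1].split(",") if item.strip()]
    return value


c1df["customer1"] = c1df["customer1"].map(parse_cell)

print(c1df.loc["analysis_duration", "customer1"])
# ['2022-08-23', '2022-11-02']
```

The quotes that remain around each date are just how Python prints strings inside a list; the cell itself is now a real list rather than one quoted string. If actual dates are needed, each item could then be passed to pd.to_datetime.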