I have a CSV file with the following data:
List Of Tables
DimCurrency
DimOrganization
DimProduct
DimProductCategory
DimProductSubcategory
DimPromotion
Now, I need to read this data and append it to a list.
But when I try to read the data using the following code:
import pandas as pd

table_col = ['table']
df1 = pd.read_csv('E:/Tabledata.csv', delimiter=',', names=table_col, header=0)
print(df1)
I get this output:
                       table
DimCurrency              NaN
DimOrganization          NaN
DimProduct               NaN
DimProductCategory       NaN
DimProductSubcategory    NaN
DimPromotion             NaN
I tried to remove the NaN values using the df.fillna('') method, which gave me this output:
                      table
DimCurrency
DimOrganization
DimProduct
DimProductCategory
DimProductSubcategory
DimPromotion
And when I try to append the 'table' column to a list, I get:
['', '', '', '', '', '']
Any suggestions on how I can resolve this?
CodePudding user response:
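Use read_csv and squeeze the single column into a Series. The squeeze=True keyword was deprecated in pandas 1.4 and removed in 2.0, so on current versions call .squeeze('columns') on the result instead:
>>> pd.read_csv('E:/Tabledata.csv').squeeze('columns').tolist()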
['DimCurrency',
'DimOrganization',
'DimProduct',
'DimProductCategory',
'DimProductSubcategory',
'DimPromotion']
Or you can simply use:
with open('E:/Tabledata.csv') as f:
    lines = [line.strip() for line in f][1:]  # [1:] drops the header line
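As a minimal sketch of the pandas approach on recent versions, the snippet below reads an in-memory stand-in for the file and uses DataFrame.squeeze('columns') in place of the removed squeeze=True keyword. The sample text is an assumption (a header line, then one name per line with no trailing commas), not the asker's actual file.

```python
import io

import pandas as pd

# Hypothetical stand-in for E:/Tabledata.csv: a header line, then one name per line.
sample = "List Of Tables\nDimCurrency\nDimOrganization\nDimProduct\n"

# A one-column DataFrame squeezes down to a Series, which converts to a plain list.
tables = pd.read_csv(io.StringIO(sample)).squeeze("columns").tolist()
print(tables)
```

If the real file has extra delimiters on each line, the frame will have more than one column and squeeze will leave it as a DataFrame, so this only applies to a genuinely single-column file.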
CodePudding user response:
If your goal is simply to read a single-column CSV file into a list, don't use pandas; plain Python is enough:
with open('Tabledata.csv') as f:
    for i in range(1): # set here the number of header lines to skip
        next(f)
    out = list(map(str.strip, f.readlines()))
output:
['DimCurrency',
'DimOrganization',
'DimProduct',
'DimProductCategory',
'DimProductSubcategory',
'DimPromotion']
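If the lines might carry a trailing delimiter, stripping whitespace alone would leave the comma in place. The sketch below uses the standard csv module and keeps only the first field of each row. The helper name first_column and the sample contents (including the trailing commas) are assumptions made for illustration.

```python
import csv
import io

# Hypothetical file contents; the trailing commas are an assumption.
sample = "List Of Tables\nDimCurrency,\nDimOrganization,\nDimProduct,\n"


def first_column(handle, header_lines=1):
    """Return the first field of every non-empty row after the header."""
    reader = csv.reader(handle)
    for _ in range(header_lines):
        next(reader, None)  # skip header rows without failing on short input
    # csv.reader yields [] for blank lines, so "if row" filters them out.
    return [row[0].strip() for row in reader if row]


names = first_column(io.StringIO(sample))
print(names)
```

With a real file, pass open('Tabledata.csv', newline='') instead of the StringIO object.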
CodePudding user response:
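It seems the values you need ended up in the index rather than in the 'table' column. That typically happens when each data row has more fields than the names you passed (for example, because of a trailing comma), so pandas uses the first field as the index. After loading the data into the DataFrame, take the list from the index: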
list(df1.index)
output:
['DimCurrency', 'DimOrganization', 'DimProduct', 'DimProductCategory', 'DimProductSubcategory', 'DimPromotion']
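To make the behaviour concrete, here is a small sketch that reproduces it with an in-memory file. The sample contents are an assumption: the original file is not shown byte for byte, and a trailing comma on each data row is one layout that sends the names into the index.

```python
import io

import pandas as pd

# Hypothetical file contents: the trailing comma on each data row is an assumption.
sample = "List Of Tables\nDimCurrency,\nDimOrganization,\nDimProduct,\n"

df1 = pd.read_csv(io.StringIO(sample), names=["table"], header=0)

# Each data row has two fields but only one name was passed, so pandas
# treats the first field as the index and the empty second field as "table".
all_missing = bool(df1["table"].isna().all())

# The names therefore have to be read from the index, not from the column.
from_index = list(df1.index)
print(all_missing, from_index)
```

This also explains the list of empty strings in the question: after fillna(''), the 'table' column holds only empty strings, while the names stay in the index.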