Home > Back-end >  Pandas how to read CSV and update specific cells that are currently blank
Pandas how to read CSV and update specific cells that are currently blank

Time:10-01

I have a CSV File that is blank in row 1 for Columns A: D. I need the file to populate the header labels.

Is there a simple way to insert values for Row 1 only for columns A, B, C, and D? Columns E through Z already have header values, it is just columns A, B, C, and D I need to force-populate.

Code I"m trying to populate the values as expected, but also seems to add a row that displays numbers for each column 1, 2,3,4,5 etc.. and forces all of the correct labels to the second row.

gg = pd.read_csv(r'\\path\filename.csv')
gg.columns[0:4] = ["Date", "Type", "SubType1", "SubType2"]

Edit , here's sample data, that hopefully shows that the csv is blank in row 1 for columns A:D (this is when I print(gg)

 Unnamed:0   Unnamed:1 Unnamed:2  Unnamed:3   Name1             Name2          Name1     Name2
    Date      Type    Subtype Subtype2         Metric 1            metric1      metric2  metric2
    9/8/2022  Vision  LS-54   LS                  54               .0234234   .923423    .83
    9/8/2022  Vision  LS-55   LS                  55               .023234    .23423     .23
    9/8/2022  Storm   LS-54   LS                  54               .0234234   .923423    .534
    9/8/2022  Storm   LS-55   LS                  55               .023234    .23423     .343

Here's what I want it to display:

Date     Type     SubType  SubType2             Name2          Name1     Name1       Name2
Date      Type    Subtype Subtype2         Metric 1            metric1     metric2  metric2
9/8/2022  Vision  LS-54   LS                  54               .0234234   .923423    .83
9/8/2022  Vision  LS-55   LS                  55               .023234    .23423     .23
9/8/2022  Storm   LS-54   LS                  54               .0234234   .923423    .534
9/8/2022  Storm   LS-55   LS                  55               .023234    .23423     .343

The current code does not work, but seeking to force those cells to populate the values while leaving the rest of the csv as is if that makes sense.

CodePudding user response:

Solution1:

You are on right track... it's just column names of a data frame needed to change for all of them in one go; So try something like this...

gg.columns = ["Date", "Type", "SubType1", "SubType2"]   list(gg.columns[4:])

Solution2:

In case, if you don't wish to rename the column names via indexing then this is another way, where we don't need to know where are these columns...

gg.rename(columns={'Unnamed: 0':'Date',
                   'Unnamed: 1':'Type',
                   'Unnamed: 2':'SubType1',
                   'Unnamed: 3':'SubType2'}, inplace=True)

Another Problem Statement:

How to deal with duplicated column name while loading csv file using pandas?

Answer: If you see this link (enter image description here Hope this Helps...

  • Related