I'm reading a CSV file that have two columns: continent, countries. The cells are empty under each continents.
Continent | Countries
-----------------------------
Asia | China
| Japan
| Korea
North Am. | Canada
| U.S.
but I would like to merge these two columns under Continent column and have an indentation.
Continent
-----------
Asia
China
Japan
Korea
North America
Canada
U.S.
How can I do this using python?
CodePudding user response:
The code below should do the trick. You don't even need to use pandas to read a CSV as Python have already a standard module for that called csv. The code outputs to another CSV called output.csv
.
import csv
continent = ''
with open('data.csv') as csv_in, open('output.csv', 'w') as csv_out:
reader = csv.reader(csv_in)
writer = csv.writer(csv_out)
headers = next(reader, None) # skip the headers
writer.writerow([headers[0]])
for row in reader:
prev_continent = continent
continent = row[0].strip() or continent
if prev_continent != continent:
writer.writerow([continent])
writer.writerow([f' {row[1]}'])
output.csv
Continent
Asia
China
Japan
Korea
North Am.
Canada
U.S.
data.csv
Continent,Countries
Asia ,China
,Japan
,Korea
North Am.,Canada
,U.S.