I have a CSV file with 100 records. In the first iteration, I want to write the first 50 records to a new CSV file, 'newFile.csv'. In the second iteration, I want to read the next 50 records from the original CSV file and write them to 'newFile.csv'.
I can perform the first iteration, but in the second iteration I cannot get the next 50 rows written to the CSV file as expected. Can someone please help me with this? Thank you.
Here is the code:
import pandas as pd
oldData = pd.read_csv('oldFile.csv') # Has 100 rows
for i in range(2):
    newData = pd.read_csv('oldFile.csv', nrows=50) # Has 50 rows
    newCsv = newData.to_csv('newFile.csv', index=False)
    newData = newData.iloc[50:] # Removes those 50 rows
CodePudding user response:
import pandas as pd
# With chunksize=50, read_csv yields the file 50 rows at a time.
for newData in pd.read_csv('oldFile.csv', chunksize=50):
    # Each pass overwrites newFile.csv with the current 50-row chunk.
    newData.to_csv('newFile.csv', index=False)
This way, each chunk read from the .csv file contains 50 rows: the first iteration yields the first 50 rows, the second yields rows 51 to 100, and so on.
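To see which rows each chunk covers, here is a minimal sketch. It assumes a throwaway 100-row file with a single hypothetical `id` column, created in a temporary directory so the example does not depend on the real oldFile.csv.

```python
import os
import tempfile

import pandas as pd

# Build a throwaway 100-row file (hypothetical data) so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'oldFile.csv')
pd.DataFrame({'id': range(1, 101)}).to_csv(src, index=False)

chunk_ranges = []
for chunk in pd.read_csv(src, chunksize=50):
    # The row index keeps counting across chunks instead of restarting at 0.
    chunk_ranges.append((chunk.index[0], chunk.index[-1], len(chunk)))

print(chunk_ranges)  # [(0, 49, 50), (50, 99, 50)]
```

The index is zero-based, so labels 0 to 49 correspond to records 1 to 50 and labels 50 to 99 to records 51 to 100.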
CodePudding user response:
You can read oldFile.csv in chunks of 50 rows and then process each chunk individually, e.g.,
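import pandas as pd
nRows = 50
# header=None treats every line of the file, including any header line, as data.
with pd.read_csv('oldFile.csv', chunksize=nRows, header=None) as reader:
    for chunk in reader:
        print(chunk)
        # Write the current chunk without a header row.
        chunk.to_csv('newFile.csv', index=False, header=None)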
Note that newFile.csv is overwritten on each iteration, so after the loop it contains only the last chunk.
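If the goal is instead for newFile.csv to accumulate all chunks, one option is to append after the first write. The following is a rough sketch rather than a definitive solution: it assumes the source file has a header row, and it uses a temporary directory and hypothetical `id` and `value` columns in place of the real data.

```python
import os
import tempfile

import pandas as pd

# Throwaway input file (hypothetical columns) so the sketch runs on its own.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'oldFile.csv')
dst = os.path.join(tmp_dir, 'newFile.csv')
pd.DataFrame({'id': range(1, 101), 'value': range(100)}).to_csv(src, index=False)

for i, chunk in enumerate(pd.read_csv(src, chunksize=50)):
    chunk.to_csv(
        dst,
        index=False,
        mode='w' if i == 0 else 'a',  # create the file once, then append
        header=(i == 0),              # write the column names only with the first chunk
    )

result = pd.read_csv(dst)
print(len(result))  # 100
```

Opening the file with mode='w' for the first chunk keeps a rerun from appending to output left over from an earlier run.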