I am using a code for a query, sometimes the input goes and there is no return (basically it does not find anything so the return is an empty row) so it is empty. However, when I use pd.concat
, those empty rows disappear. Is there a way to keep these no return rows in the loop as well so that when I use that I can have empty rows on the final output.csv
?
import numpy as np
import pandas as pd
from dl import authClient as ac, queryClient as qc
from dl.helpers.utils import convert
import openpyxl as xl
wb = xl.load_workbook('/Users/somethingfile.xlsx')
sheet = wb['Sheet 1']
df = pd.DataFrame([],columns = ['col1','col2',...,'coln'])
for row in range(3, sheet.max_row 1):
a0, b0, r = sheet.cell(row,1).value, sheet.cell(row,2).value, 0.001
query = """
SELECT a,b,c,d,e FROM smthng
WHERE q3c_radial_query(a,b,{:f},{:f},{:f}) LIMIT 1
""".format(a0,b0,r)
response = qc.query(sql=query,format='csv')
temp_df = convert(response,'pandas')
df = pd.concat([df,temp_df])
df.to_csv('output.csv')
CodePudding user response:
So, as far as I understand, the problem is that when "temp_df" is empty you want to add a blank row. You should be able to do that using the .append() method, appending an empty Series().
if len(temp_df) == 0:
temp_df=temp_df.append(pd.Series(), ignore_index=True)
#Then concat...
CodePudding user response:
For your specific question, it works if you check if temp_df
is empty or not in each step and make it a DataFrame of NaNs if it is empty.
Another note on the implementation is that, concatenating in each iteration will be a very expensive operation. It is much faster if you store the temp_df
s in each iteration in a list and concatenate once after the loop is over.
lst = [] # <-- empty list to fill later
for row in range(3, sheet.max_row 1):
a0, b0, r = sheet.cell(row,1).value, sheet.cell(row,2).value, 0.001
query = """
SELECT a,b,c,d,e FROM smthng
WHERE q3c_radial_query(a,b,{:f},{:f},{:f}) LIMIT 1
""".format(a0,b0,r)
response = qc.query(sql=query,format='csv')
temp_df = convert(response,'pandas')
if temp_df.empty:
temp_df = pd.DataFrame([np.nan])
lst.append(temp_df)
df = pd.concat(lst) # <-- concat once
df.to_csv('output.csv', index=False)