Home > other >  I want to export the date row with the same date from another excel
I want to export the date row with the same date from another excel

Time:08-09

I am new to Python, and I am about to solve some simple questions.

I have two excel datasets. The first one contains the samples I have taken on different days, with a separate "dates" column and a "month" column, as below:

enter image description here

and the other excel, with the date I only wanted

enter image description here

and I want to filter the first excel with the dates that are equal to the second excel, and print out the rows that meet these conditions.

Here is my code: 1st attempt:

import datetime

import pandas
import pandas as pd
import openpyxl
from pandas import DataFrame
from datetime import datetime
import datetime as dt
import numpy as np

df_1=pd.read_excel('waterstationin29UMT.xlsx',sheet_name='Sheet1') #first excel
print(df_1)
df_2=pd.read_excel('29UMTdate.xlsx',sheet_name='Sheet1') #second excel
print(df_2)

minyan_d = df_2.iloc[1:11:,1:4]              #only want the date information
print(type(minyan_d))
# obtain datetime series:
df_2_1=pd.to_datetime(minyan_d['system:time_start'],format='%Y/%m/%d') 
print(type(df_2_1.dt.year))
water_date = df_1['SampDay']
water_month = df_1['SampMonth']
df_1.where((df_1['SampDay'] == df_2_1.dt.day) and (df_1['SampMonth'] == df_2_1.dt.month))
```

and get the bug said that 'Can only compare identically-labeled series objects'

second attempt:
```for x in water_date:
   for y in water_month:
      if [x == (df_2_1.dt.day)] and [y ==(df_2_1.dt.month)]:
         print(1)
      else:
         print(2)

I know some of the data in the first excel can not be found in the second excel, but I do not know why I only get one. I am new to Python. Just forgive the naive problems I have made.

CodePudding user response:

Use merge().

Example:

merged = pd.merge(df_1, df_2, how='inner', on=['SampMonth'])

This will join the rows that have the same entry in the SampMonth column across the dataframes. You can also use how="outer", which results in a different merge.

Read the documentation to learn how to use it for your needs - it is quite helpful.

Or, if you just want to filter one dataframe, you could do this:

df_1 = df_1[df_1["SampMonth" == <insert month here>]]

This will return the rows in the dataframe where the month matches what you want -- you can make the internal boolean a compound boolean, as in (condition 1) & (condition 2) & (condition 3) to make the whole date match.

CodePudding user response:

guys. I think I found the solution, but still being very complicated, I have transferred the first excel's data to day/month/year format, and compare these two via this code:) minmin = df_1['value'].isin(df_2.iloc[0:11,5:8]) which give me False/True. Any efficient and time-saving codes are needed also!

  • Related