Home > Blockchain >  Extracting a single value based on values in the same row (pandas)
Extracting a single value based on values in the same row (pandas)

Time:11-12

I have a pandas DataFrame where I am trying to write a code that accepts user inputs pertaining to a Project Type and a Capacity and will return the 'Days' value IF the inputted capacity falls within the Min - Max range and matches the Project Type.

df = pd.DataFrame({'Type': ['Wind - Onshore', 'Wind - Onshore', 'Wind - Onshore', 'Wind - Offshore', 'Wind - Offshore','Wind - Offshore', 'Solar PV', 'Solar PV', 'Solar PV'],
               'Min': [0.0, 5.0, 10.0, 0.0, 5.0, 10.0, 0.5, 1.0, 2.5],
               'Max': [4.9990,9.9990, 19.9990, 4.9990, 9.9990, 19.9990, 0.9990, 2.4999, 4.9990],
               'Days': [189.643564, 200.380952, 297.146154, 331.666667, 121.500000, 154.000000, 171.711956, 185.362637, 194.635246]})

df

The user input will look like this:

print('t1 = Wind - Onshore\nt2 = Wind - Offshore\nt3 = Solar PV\n')

t1 = 'Wind - Onshore'
t2 = 'Wind - Offshore'
t3 = 'Solar PV'
type = input('Enter Project Type:')
cap = float(input('Enter Capacity:'))

For example, if the user enters t1 for the Project Type and 3 for the Capacity, the code should return 189.643564 because it falls between the Min and Max of the corresponding Type.

All of my attempts with using for loops/if statements have been unsuccessful. I am novice and would appreciate if anyone could show me an efficient and reproducible code to complete this task. Thanks!

CodePudding user response:

You should change float(input('Enter Capacity:')) to int(...), then...

opts = {
    't1': 'Wind - Onshore',
    't2': 'Wind - Offshore',
    't3': 'Solar PV',
}

type = 'Wind - Offshore'
cap = 2
row = df[df['Type'] == opts[type]].iloc[cap]
print(row)

Output:

Type    Wind - Offshore
Min                10.0
Max              19.999
Days              154.0
Name: 5, dtype: object

Basically what that does, is it creates a mapping from tXX to the actual display name of the type, and then it gets all the rows of the df that have that type, then it gets the row from that selection indexed by cap.

You can access the values from that row like this:

print(row['Min'], row['Max'], row['Days'])

Output:

10.0 19.999 154.0

CodePudding user response:

Instead of if statements you can create a dictionary, types_dict, that maps the type abbreviation (user_type: 't1', 't2' or 't3') to the corresponding type.

You can write the two conditions

  1. df.Type == types_dict[user_type]
  2. df.Min <= user_cap <= df.Max

as a single query based on the user input values. Then pass it to DataFrame.query to select the row that fulfils the conditions. Finally, select only the 'Days' value.

df = pd.DataFrame({'Type': ['Wind - Onshore', 'Wind - Onshore', 'Wind - Onshore', 'Wind - Offshore', 'Wind - Offshore','Wind - Offshore', 'Solar PV', 'Solar PV', 'Solar PV'],
               'Min': [0.0, 5.0, 10.0, 0.0, 5.0, 10.0, 0.5, 1.0, 2.5],
               'Max': [4.9990,9.9990, 19.9990, 4.9990, 9.9990, 19.9990, 0.9990, 2.4999, 4.9990],
               'Days': [189.643564, 200.380952, 297.146154, 331.666667, 121.500000, 154.000000, 171.711956, 185.362637, 194.635246]})

print('t1 = Wind - Onshore\nt2 = Wind - Offshore\nt3 = Solar PV\n')

# map type abbreviation to the correct type 
types_dict = {
    't1': 'Wind - Onshore', 
    't2': 'Wind - Offshore',
    't3': 'Solar PV'
}

user_type = input('Enter Project Type: ')
user_cap = float(input('Enter Capacity: '))

# condition to fulfil. 
query = f"Type == '{types_dict[user_type]}' and Min <= {user_cap} <= Max"

# get the 'Days' of the row that satisfy the previous condition 
days = df.query(query)['Days'].iat[0]

print("\nDays:", days)

Output

t1 = Wind - Onshore
t2 = Wind - Offshore
t3 = Solar PV

Enter Project Type: t1
Enter Capacity: 3

Days: 189.643564
  • Related