How to print out from two different columns the min and max in a csv file?

I want to write a program that evaluates data about a construction site operation from an ASCII table stored as a CSV file. Now I would like to print the qualification with the highest cost and the qualification with the lowest cost. The output should show the qualification name and the cost.

The template file is an Excel file:

name          Qualification       Month        costs
Max Mustermann  Seller            Dec 18      6.155,39
Max Mustermann  Seller            Jan 19      5.069,15
Max Mustermann  Seller            Dec 18      362,08
Klee klumper    Seller            Jan 19      4.637,65
Klee klumper    Seller             Mar19      1.159,41
Koch Schnerider Project Engineer   Jan 19     1.358,28
Koch Schnerider Project Engineer   Jul 19     679,14
Müller Manim    Distribution       Sep 19     15.149,28
Müller Manim    Distribution       Jan 19     16.743,94
Schach Matt     Site Manager       Sep19      14.399,79
Schach Matt     Site Manager       Jan 19     1.371,41
Zeimetz Kinder  Project Engineer   Jul 19     11.376,50
Zeimetz Kinder  Project Engineer   Jan 19     2.133,09

In the end it should look like this:

Min. Cost is = Seller 362,08
Max. Cost is = Distribution 16.743,94

I already have the minimum and maximum values, but how do I find out which qualification each one belongs to?

import pandas as pd
import os

filename = "site_operation.csv"
path = "."
file = os.path.join(path, filename)
tscv1 = pd.read_csv(file, sep=";", thousands=".", decimal=",", encoding="ansi")

total_cost = tscv1['costs'].sum()
print("Total costs from all operations: ", total_cost)

minimum = tscv1['costs'].min()
maximum = tscv1['costs'].max()

print("Min. Cost is =", minimum)
print("Max. Cost is =", maximum)

CodePudding user response：

You can use nsmallest and nlargest to return the top or bottom n rows of a DataFrame, ordered by a column.

# df is the DataFrame loaded from the CSV (tscv1 in the question).
# With n=1, each call returns a one-row DataFrame;
# .iloc[0] extracts that row as a Series.
min_row = df.nsmallest(1, 'costs').iloc[0]
max_row = df.nlargest(1, 'costs').iloc[0]

print(f"Min. Cost is = {min_row['Qualification']} {min_row['costs']}")
print(f"Max. Cost is = {max_row['Qualification']} {max_row['costs']}")
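
As an alternative, idxmin and idxmax give the index label of the extreme value, and .loc then returns the full row. The sketch below is self-contained and makes a few assumptions: the inline sample (CSV_TEXT) is a shortened, hypothetical stand-in for the real file, and format_german is a helper name invented here to turn the parsed floats back into the German number format used in the desired output.

```python
import io

import pandas as pd

# Hypothetical sample in the same layout as the question's file
# (semicolon-separated, '.' as thousands separator, ',' as decimal mark).
CSV_TEXT = """name;Qualification;Month;costs
Max Mustermann;Seller;Dec 18;6.155,39
Max Mustermann;Seller;Dec 18;362,08
Müller Manim;Distribution;Sep 19;15.149,28
Müller Manim;Distribution;Jan 19;16.743,94
Schach Matt;Site Manager;Jan 19;1.371,41
"""

df = pd.read_csv(io.StringIO(CSV_TEXT), sep=";", thousands=".", decimal=",")


def format_german(value: float) -> str:
    """Format a number with '.' as thousands separator and ',' as decimal mark."""
    us_style = f"{value:,.2f}"  # e.g. 16,743.94
    # Swap the two separators in one pass.
    return us_style.translate(str.maketrans(",.", ".,"))


# idxmin/idxmax return the index label of the first minimum/maximum;
# .loc fetches the whole row for that label.
min_row = df.loc[df["costs"].idxmin()]
max_row = df.loc[df["costs"].idxmax()]

min_line = f"Min. Cost is = {min_row['Qualification']} {format_german(min_row['costs'])}"
max_line = f"Max. Cost is = {max_row['Qualification']} {format_german(max_row['costs'])}"
print(min_line)
print(max_line)
```

If several rows share the same minimum or maximum cost, idxmin and idxmax pick the first one. To run this against the real file, replace io.StringIO(CSV_TEXT) with the file path and add the encoding argument from the question.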