Data Frame, Subset the user name and print the max and print the max value of each column in the sub

Time:12-05

I am trying to solve this pandas coding problem but don't know how to do it. The code below is as far as I got, but it raises an error. Can anybody help and explain how to do it? Thank you.

The Cars dataset has three columns giving the quality, machining angle, and machining speed of 18 cars.

Write a program that performs the following tasks:

  1. Load the data Cars.csv into a data frame called cars_df.

  2. Subset the first userNum rows of the data frame into a new data frame.

  3. Find and print the maximum values of each column in the subset.

Ex: If the input is:
5
the output is:

Quality    5
Speed      4
Angle      3
dtype: int64

my code:

import pandas as pd


cars_df = pd.DataFrame("Cars.csv")# Import the CSV file Cars.csv

userNum = int(input())

# Subset the first userNum rows of the data frame
userNum = pd.iloc[:0,:]
userNum.max()
print(userNum)

the error:

Traceback (most recent call last):
  File "main.py", line 4, in <module>
    cars_df = pd.DataFrame("Cars.csv")# Import the CSV file Cars.csv
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/frame.py", line 730, in __init__
    raise ValueError("DataFrame constructor not properly called!")
ValueError: DataFrame constructor not properly called!

I am a beginner in Python and most of the time I feel clueless about coding. I don't know how to write this program. Any help and explanation would be appreciated.

CodePudding user response:

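You didn't read in the CSV correctly. pd.DataFrame() builds a data frame from data that is already in memory; to load a file, use pd.read_csv().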

import pandas as pd


cars_df = pd.read_csv("Cars.csv")
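
To make the difference concrete, here is a minimal sketch. The contents of Cars.csv are not shown in the question, so the rows below are made-up stand-ins, and io.StringIO is used only so the snippet runs without the file. With the real file you would pass the path "Cars.csv" instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for Cars.csv; the real values are not in the question.
csv_text = """Quality,Speed,Angle
5,4,3
3,2,1
4,4,2
"""

# read_csv accepts a file path (e.g. "Cars.csv") or any file-like object.
cars_df = pd.read_csv(io.StringIO(csv_text))
print(cars_df.shape)  # (3, 3)

# The constructor expects data (dict, list, array, ...), not a file name,
# so a bare string is rejected, as in the traceback from the question.
try:
    pd.DataFrame("Cars.csv")
except (ValueError, TypeError) as err:
    print(type(err).__name__, err)
```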

Also, a good starting point whenever you are stuck is to refer to the documentation of whatever library you are using.

For example, if you search for pd.DataFrame() in your browser, you'll find its documentation page. There you can see what types of input each parameter of a function takes.

So, to help get you unstuck: once you read in the CSV, the result is a DataFrame. DataFrames are how you directly interact with tabular data.

cars_df[:numRows]  # gets the first numRows rows of the cars dataframe
# Using the colon like this is referred to as slicing.
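
As a rough illustration of that slice, the sketch below builds a small frame by hand (the numbers are invented, not the real Cars data) and shows that slicing, .head() and .iloc all return the same first rows. The name num_rows is just a placeholder for the value read from input().

```python
import pandas as pd

# Made-up values for illustration only; not the real Cars data.
cars_df = pd.DataFrame({
    "Quality": [5, 3, 4, 1, 2, 6],
    "Speed": [4, 2, 4, 1, 3, 7],
    "Angle": [3, 1, 2, 2, 1, 8],
})

num_rows = 5  # in the exercise this would be int(input())

by_slice = cars_df[:num_rows]      # plain row slice
by_head = cars_df.head(num_rows)   # same rows via head()
by_iloc = cars_df.iloc[:num_rows]  # same rows via positional indexing

print(by_slice.equals(by_head) and by_head.equals(by_iloc))  # True

# Column-wise maximum of the subset, printed as a Series.
print(by_slice.max())
```

Which of the three you use is mostly a matter of taste; .head() tends to read most clearly when you only want the first rows.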

Once again, if you go to the pandas documentation and search for iloc, you'll get a link to a page about pandas.DataFrame.iloc. The dot notation is how you access attributes of an object. The first dot usually just makes explicit which library or namespace a name comes from, which matters when different libraries have functions that share a name.

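So, given a DataFrame, .iloc selects rows and columns by integer position. Note that .iloc belongs to a DataFrame object such as cars_df, not to the pd module itself, which is why pd.iloc does not work. The documentation for .iloc gives good examples of how to access specific rows.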
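
Here is a small sketch of positional indexing with .iloc, again on invented numbers rather than the real Cars data. It also shows what the slice :0 from the question selects.

```python
import pandas as pd

# Made-up values for illustration only.
cars_df = pd.DataFrame({
    "Quality": [5, 3, 4],
    "Speed": [4, 2, 4],
    "Angle": [3, 1, 2],
})

first_row = cars_df.iloc[0]    # a Series holding row 0
first_two = cars_df.iloc[:2]   # a DataFrame with rows 0 and 1
one_cell = cars_df.iloc[1, 2]  # row 1, column 2 (Angle)

# The slice used in the question, [:0, :], stops before row 0,
# so it selects no rows at all.
empty = cars_df.iloc[:0, :]

print(first_row["Quality"], len(first_two), one_cell, len(empty))  # 5 2 1 0
```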

All in all, the best way to get unstuck is to look for examples online and see if the code documentation includes helpful examples.

CodePudding user response:

I think this is what you are after

import pandas as pd

cars_df = pd.read_csv("Cars.csv")  # Import the CSV file Cars.csv

user_num = int(input())

subset = cars_df.head(user_num)

print(subset.max())

Instead of the last line, you could call

print(subset.describe())

This gives you a lot more statistics than just .max().
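
To compare the two calls, the sketch below uses a tiny hand-made frame (invented numbers, not the real Cars data). The "max" row of .describe() holds the same values as .max(), but as floats, so for an exercise that expects the int64 output shown in the question, .max() is probably the safer choice.

```python
import pandas as pd

# Made-up values for illustration only.
cars_df = pd.DataFrame({
    "Quality": [5, 3, 4],
    "Speed": [4, 2, 4],
    "Angle": [3, 1, 2],
})

subset = cars_df.head(2)
stats = subset.describe()

# describe() returns one row per statistic.
print(list(stats.index))

# The "max" row matches subset.max(), only with a float dtype.
print(stats.loc["max"])
print(subset.max())
```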
