I am trying to solve this pandas coding problem but don't know how to do it. The code below is as far as I got, but it raises an error. Can anybody help me and explain how to do it? Thank you.
The Cars dataset has three columns giving the quality, machining angle, and machining speed of 18 cars.
Write a program that performs the following tasks:
Load the data Cars.csv into a data frame called cars_df.
Subset the first userNum rows of the data frame into a new data frame.
Find and print the maximum values of each column in the subset.
Ex: If the input is:
5
the output is:
Quality 5
Speed 4
Angle 3
dtype: int64
My code:
import pandas as pd
cars_df = pd.DataFrame("Cars.csv")# Import the CSV file Cars.csv
userNum = int(input())
# Subset the first userNum rows of the data frame
userNum = pd.iloc[:0,:]
userNum.max()
print(userNum)
Traceback (most recent call last):
  File "main.py", line 4, in <module>
    cars_df = pd.DataFrame("Cars.csv")# Import the CSV file Cars.csv
  File "/usr/local/lib/python3.8/dist-packages/pandas/core/frame.py", line 730, in __init__
    raise ValueError("DataFrame constructor not properly called!")
ValueError: DataFrame constructor not properly called!
I am a beginner in Python and feel clueless about coding most of the time. I don't know how to write this code. Your help and an explanation would be appreciated.
CodePudding user response:
You didn't read in the CSV correctly.
import pandas as pd
cars_df = pd.read_csv("Cars.csv")
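To see why the original line fails, here is a minimal sketch. pd.DataFrame() expects the data itself (a dict, a list of rows, an array), so a bare file name is rejected, while pd.read_csv() parses CSV text into a DataFrame. The three rows below are made-up stand-ins, and io.StringIO replaces the real Cars.csv so the snippet runs without the file.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of Cars.csv (values are invented).
csv_text = "Quality,Speed,Angle\n5,4,3\n4,2,1\n3,3,2\n"

# Passing a file name to the constructor raises the error from the traceback.
try:
    pd.DataFrame("Cars.csv")
    constructor_failed = False
except ValueError as err:
    constructor_failed = True
    print("pd.DataFrame failed:", err)

# read_csv accepts a path or a file-like object and returns a DataFrame.
cars_df = pd.read_csv(io.StringIO(csv_text))
print(type(cars_df).__name__)
print(list(cars_df.columns))
```

With the real file in the working directory, pd.read_csv("Cars.csv") does the same job.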
Also, a good starting point whenever you are stuck is to refer to the documentation of whatever library you are using.
For example, if you look up pd.DataFrame() in your browser, you'll find its documentation page. There you can see what types of input each parameter of a function takes.
So to help get you unstuck: once you read in the CSV, the result is a DataFrame. DataFrames are how you directly interact with tabular data.
cars_df[:numRows] #-> gets the first numRows of the cars dataframe.
#using the colon is referred to as slicing.
Once again, if you go to the pandas documentation and search for iloc, you'll get a link to the page for pandas.DataFrame.iloc. Dot notation is how you access attributes of an object. The first dot usually just makes explicit which library or namespace we are reaching into, which matters when different libraries have functions that share a name.
So given a DataFrame, .iloc selects by integer location: it picks out rows (and columns) by their position rather than by their label. The documentation for .iloc gives good examples of how to access specific rows.
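As a rough illustration of the slicing described above, the sketch below takes the first rows three ways: plain slicing, .iloc, and .head(). The table is invented for the example (the real data lives in Cars.csv), and num_rows stands in for the user's input.

```python
import pandas as pd

# Invented sample data; the real values come from Cars.csv.
cars_df = pd.DataFrame({
    "Quality": [5, 3, 4, 2, 1, 6],
    "Speed": [4, 1, 2, 3, 2, 7],
    "Angle": [3, 2, 1, 3, 1, 8],
})
num_rows = 5  # stands in for int(input())

by_slice = cars_df[:num_rows]       # plain row slicing
by_iloc = cars_df.iloc[:num_rows]   # integer-position indexing
by_head = cars_df.head(num_rows)    # convenience method

# All three select the same rows.
print(by_slice.equals(by_iloc) and by_iloc.equals(by_head))

# Column-wise maximum of the subset.
print(by_iloc.max())
```

Any of the three forms works for this exercise; calling .max() on the result gives one maximum per column.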
All in all, the best way to get unstuck is to look for examples online and see if the code documentation includes helpful examples.
CodePudding user response:
I think this is what you are after
import pandas as pd
cars_df = pd.read_csv("Cars.csv")  # Import the CSV file Cars.csv
user_num = int(input())
subset = cars_df.head(user_num)
print(subset.max())
Instead of the last line, you could call
print(subset.describe())
which would give you a lot more stats than just .max().
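For a sense of how the two calls relate, here is a small sketch with invented values (not the real Cars.csv data): the "max" row of describe() carries the same numbers that .max() returns, alongside the count, mean, standard deviation, minimum, and quartiles.

```python
import pandas as pd

# Invented sample data standing in for the first rows of Cars.csv.
subset = pd.DataFrame({
    "Quality": [5, 3, 4],
    "Speed": [4, 1, 2],
    "Angle": [3, 2, 1],
})

stats = subset.describe()   # one row per statistic, one column per variable
print(list(stats.index))    # names of the statistics that were computed
print(stats.loc["max"])     # same values as subset.max(), stored as floats
```

Note that describe() prints a table rather than the single Series the exercise expects, so .max() is still the call to use for the graded output.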