Why does printing a list copied from a DataFrame also print the Name and dtype of every row?

Time:05-08

I'm relatively new to Python, and super new to Pandas. I'm trying to cross-reference two DataFrames: finding the people that meet a condition in DataFrame A, and creating a new DataFrame with their name plus their "value" from DataFrame B. I've got the main logic working, but for some reason when I print the result I built, which should contain each person's name and value, it also prints extra data about the name and value, like so:

 [277    Joe Smith
 Name: Name, dtype: object, 277    0.7
 Name: Value, dtype: float64], [207    Steve Smith
 Name: Name, dtype: object, 207    0.6
 Name: Value, dtype: float64]

To be honest, I'm not even sure what the 277 and 207 numbers represent. What I'm expecting is something like:

 Joe Smith  0.7
 Steve Smith 0.6

The way I'm creating the DataFrame is:

for index, row in myplayers.iterrows():
    playerrow = allplayers['playerID'] == row['playerID']
    playerlist.append([allplayers.loc[playerrow,'Name'],allplayers.loc[playerrow,'Value']])

Where myplayers contains the playerID of all the players I'm interested in, and allplayers is the DataFrame containing the player names, IDs, and values.

At the end, I print with:

print(playerlist)

I've seen suggestions to do things like print(playerlist.to_string()) but the same thing seems to happen. Is it a problem with the way I'm adding items? Or is there a better way to print a list like that?

Thanks!

CodePudding user response:

I would recommend using the copy() and reindex() methods:

playerDataFrame = dataFrame2.copy()
player_names = ['Joe Smith', 'Steve Smith']
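# reindex() selects rows by index label and returns a new DataFrame,
# so this assumes the names are the index; assign the result to keep it
playerDataFrame = playerDataFrame.reindex(player_names)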

To print the first few rows of the DataFrame, use the head() method:

print(playerDataFrame.head())

CodePudding user response:

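You already got an answer that produces the correct output below, but as for your other question (why the additional info is getting output): iterrows() iterates over DataFrame rows as (index, Series) pairs, and selecting with a boolean mask, as in allplayers.loc[playerrow, 'Name'], also returns a Series rather than a single value. So every item you append to the list is a Series, and printing a Series shows its index label (the 277 and 207), its Name, and its dtype.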
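
A minimal sketch of that behaviour, assuming a made-up two-row allplayers frame (the playerID strings and the index labels 277 and 207 are invented here to mirror the output in the question). Selecting with a boolean mask gives a Series, while taking its first element with .iloc[0] gives the plain value:

```python
import pandas as pd

# Hypothetical sample data; index labels chosen to mirror the question's output
allplayers = pd.DataFrame(
    {
        "playerID": ["p1", "p2"],
        "Name": ["Joe Smith", "Steve Smith"],
        "Value": [0.7, 0.6],
    },
    index=[277, 207],
)

mask = allplayers["playerID"] == "p1"

# A boolean mask keeps the result as a Series, even when only one row matches
as_series = allplayers.loc[mask, "Name"]
print(type(as_series).__name__)  # Series

# .iloc[0] pulls the single value out of that one-element Series
name = allplayers.loc[mask, "Name"].iloc[0]
value = float(allplayers.loc[mask, "Value"].iloc[0])  # float() gives a plain Python float
print([name, value])  # ['Joe Smith', 0.7]
```

Appending [name, value] instead of the two Series would give a list of plain values, although the index-based lookup avoids the loop altogether.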

CodePudding user response:

[...] is there a better way [...]

Why deal with lists and not just do

allplayers.set_index('playerID').loc[
    myplayers.playerID,
    ['Name', 'Value']
]

?
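
For reference, here is that lookup as a small runnable sketch. The sample rows are invented for illustration, and printing with to_string(index=False) is just one assumed way to get output close to what the question asks for:

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's two frames
allplayers = pd.DataFrame(
    {
        "playerID": ["p1", "p2", "p3"],
        "Name": ["Joe Smith", "Steve Smith", "Ann Lee"],
        "Value": [0.7, 0.6, 0.9],
    }
)
myplayers = pd.DataFrame({"playerID": ["p1", "p2"]})

# Look up each wanted playerID in the index and keep only the two columns
result = allplayers.set_index("playerID").loc[
    myplayers["playerID"],
    ["Name", "Value"],
]

# index=False leaves the row labels out of the printed table
print(result.to_string(index=False))
```

Note that .loc raises a KeyError if any playerID in myplayers is missing from allplayers, so this sketch assumes every ID has a match.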
