I have a dataframe that I want to add to a Featuretools EntitySet:
Unnamed: 0 Year name Pos Age Tm G GS \
24672 24672 2017.0 Troy Williams SF 22.0 TOT 30.0 16.0
24675 24675 2017.0 Kyle Wiltjer PF 24.0 HOU 14.0 0.0
24688 24688 2017.0 Stephen Zimmerman C 20.0 ORL 19.0 0.0
24689 24689 2017.0 Paul Zipser SF 22.0 CHI 44.0 18.0
24690 24690 2017.0 Ivica Zubac C 19.0 LAL 38.0 11.0
MP PER ... FT% ORB DRB TRB AST STL BLK TOV \
24672 557.0 8.9 ... 0.656 15.0 54.0 69.0 25.0 27.0 10.0 33.0
24675 44.0 6.7 ... 0.500 4.0 6.0 10.0 2.0 3.0 1.0 5.0
24688 108.0 7.3 ... 0.600 11.0 24.0 35.0 4.0 2.0 5.0 3.0
24689 843.0 6.9 ... 0.775 15.0 110.0 125.0 36.0 15.0 16.0 40.0
24690 609.0 17.0 ... 0.653 41.0 118.0 159.0 30.0 14.0 33.0 30.0
PF PTS
24672 60.0 185.0
24675 4.0 13.0
24688 17.0 23.0
24689 78.0 240.0
24690 66.0 284.0
When I try to add the dataframe to a Featuretools EntitySet like this:
entity_set.add_dataframe(
    dataframe_name="season_stats",
    dataframe=season_stats,
    index="name",
)
I get this error:
/usr/local/lib/python3.8/dist-packages/woodwork/table_accessor.py in _check_index(dataframe, index)
1694
1695 if dataframe[index].isnull().any():
-> 1696 raise IndexError("Index contains null values")
1697
1698
IndexError: Index contains null values
What am I doing wrong?
Answer:
Index columns in Featuretools cannot contain missing (null) values, and their values must be unique. Based on the error message, the name column you are trying to use as the index contains null values.
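To see which of the two requirements is violated before changing anything, a quick check along these lines may help. This is only a minimal sketch on a small made-up frame: the rows are illustrative placeholders, and only the column name `name` comes from the question.

```python
import pandas as pd

# Toy stand-in for season_stats; the values are invented for illustration.
season_stats = pd.DataFrame(
    {
        "name": ["Player A", None, "Player B", "Player A"],
        "PTS": [185.0, 13.0, 23.0, 240.0],
    }
)

# Number of rows whose would-be index value is missing.
null_count = int(season_stats["name"].isnull().sum())

# Number of rows that repeat an earlier non-null name.
non_null_names = season_stats["name"].dropna()
duplicate_count = int(non_null_names.duplicated().sum())

print(f"null names: {null_count}, duplicated names: {duplicate_count}")
```

If the null count is nonzero, the error from the question is expected. A nonzero duplicate count means the uniqueness requirement is also not met.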
You will need to drop these rows before adding the dataframe to the EntitySet. You can drop every row where name is null like this:
season_stats = season_stats.dropna(subset=["name"])
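Dropping nulls addresses only the first requirement. If the remaining names turn out not to be unique (for example, when a player has more than one row), one possible workaround is to index on a separate surrogate column instead. The sketch below assumes a hypothetical column name `row_id` and a hypothetical helper `prepare_for_index`; neither is part of the original data or of Featuretools, and the toy rows are made up.

```python
import pandas as pd


def prepare_for_index(df: pd.DataFrame, column: str = "name") -> pd.DataFrame:
    """Drop rows where `column` is null and add a unique surrogate id column."""
    cleaned = df.dropna(subset=[column]).reset_index(drop=True)
    # `row_id` is a hypothetical name chosen for this example.
    cleaned["row_id"] = range(len(cleaned))
    return cleaned


# Toy data with one null name and one repeated name.
toy = pd.DataFrame(
    {
        "name": ["Player A", None, "Player B", "Player A"],
        "PTS": [185.0, 13.0, 23.0, 240.0],
    }
)

prepared = prepare_for_index(toy)
print(prepared)
```

The prepared frame could then be passed to add_dataframe with index="row_id", keeping name as an ordinary column. Whether a surrogate id or de-duplicating the rows is the better choice depends on what each row is meant to represent.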