I understand that the former gives me a Series whereas the latter gives a DataFrame. What I can't work out is the argument. df[['column_name']] gives a DataFrame. Is that because I'm passing ['column_name'], an iterable, as its data= parameter? I'm struggling to see how Python works here! My results are as follows:
df['Yil'] gives:
bir     2021
ikki    2020
19      2019
18      2018
17      2017
16      2016
15      2015
10      2010
Name: Yil, dtype: int64
df[['Yil']] gives:
       Yil
bir   2021
ikki  2020
19    2019
18    2018
17    2017
16    2016
15    2015
10    2010
CodePudding user response:
df['column_name'] returns a Series that is that column.
df[['column_name']] returns a DataFrame that has one column named column_name, which you clearly noticed...
DataFrames and Series each have some methods the other lacks. It's hard to tell which one you want to use without more info.
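A quick way to see the difference is to check the types and shapes directly. This is only a minimal sketch: the frame below is made up, modelled on the Yil column from the question with fewer rows.

```python
import pandas as pd

# Hypothetical data modelled on the question's 'Yil' column.
df = pd.DataFrame({'Yil': [2021, 2020, 2019]}, index=['bir', 'ikki', '19'])

s = df['Yil']     # a single label is one key -> Series
d = df[['Yil']]   # a list of labels -> DataFrame with those columns

print(type(s).__name__, s.shape)   # Series (3,)
print(type(d).__name__, d.shape)   # DataFrame (3, 1)
```

Note that the square brackets call DataFrame.__getitem__ rather than the DataFrame constructor, so the list is not being passed as a data= parameter. The outer brackets are the indexing operation and the inner brackets build an ordinary Python list that serves as the key.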
CodePudding user response:
For selecting certain columns of a DataFrame, the key can't be just any iterable. (For example, strings are iterable, but a string is treated as a single column label.) According to the documentation, it has to be a list, although from some quick testing, some other iterables will work:
Iterators
In [2]: df = pd.DataFrame({'a': [2, 3], 'b': [4, 5], 'c': [6, 7]})
In [3]: df[['a']]
Out[3]:
   a
0  2
1  3
In [4]: df[iter(['a'])]  # Dummy iterator
Out[4]:
   a
0  2
1  3
In [5]: df[(x for x in ['a'])]  # Dummy generator, a kind of iterator
Out[5]:
   a
0  2
1  3
Ranges
In [6]: df1 = pd.DataFrame([['a', 'b'], ['c', 'd']])
In [7]: df1[range(1)]
Out[7]:
   0
0  a
1  c
Dicts and sets also work, but using them as indexers is deprecated.
In contrast, a tuple cannot be used to select multiple columns:
In [8]: df[('a',)]
Traceback (most recent call last):
...
KeyError: ('a',)
That is because a tuple is treated as a single key, which is what makes multilevel column indexing possible:
In [9]: df2 = pd.DataFrame(
   ...:     [[2, 4], [3, 5]],
   ...:     columns=pd.MultiIndex.from_tuples([('a', 'b'), ('a', 'c')]))
In [10]: df2
Out[10]:
   a
   b  c
0  2  4
1  3  5
In [11]: df2[('a', 'c')]
Out[11]:
0    4
1    5
Name: (a, c), dtype: int64
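The same single-key versus list-of-keys rule carries over to multilevel columns. As a small sketch reusing the df2 frame from above (the variable names one and many are just for illustration): a tuple selects one column as a Series, while a list containing that tuple selects a one-column DataFrame.

```python
import pandas as pd

df2 = pd.DataFrame(
    [[2, 4], [3, 5]],
    columns=pd.MultiIndex.from_tuples([('a', 'b'), ('a', 'c')]))

one = df2[('a', 'c')]      # tuple: one multilevel key -> Series
many = df2[[('a', 'c')]]   # list of tuples: a list of keys -> DataFrame

print(type(one).__name__)    # Series
print(type(many).__name__)   # DataFrame
print(list(many.columns))    # [('a', 'c')]
```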