I want to write a for loop that builds a list from each column. There are a lot of columns, so can I use df[i] instead of the column names?
Example:
df = {
    'A': ['apple', 'hello', 'carrot'],
    'B': [4, 5, 6],
    'C': [7, 8, 9]}
for i in df:
    df[i] = list(df.select(df[i]).toPandas()[df[i]])
I want this output:
a: apple, hello, carrot
b: 4,5,6
c: 7,8,9
CodePudding user response:
To obtain the list of columns, one option is:
iterable = df.columns.to_list()
Then you can iterate over that list. One way to produce the desired output is:
for i in list(df.columns):
    print(i + ": " + str(df[i].values))
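As a minimal sketch of the same idea, assuming a plain pandas DataFrame built from the question's data (the names `column_names` and `lines` are only illustrative), the column names can be used to format each column as a comma-separated line:

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"A": ["apple", "hello", "carrot"], "B": [4, 5, 6], "C": [7, 8, 9]})

# Collect the column names, then build one "name: v1, v2, v3" line per column.
column_names = df.columns.to_list()
lines = [name + ": " + ", ".join(str(v) for v in df[name]) for name in column_names]

for line in lines:
    print(line)
```

Converting each value with str() avoids the error that plain string concatenation would raise for the numeric columns.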
CodePudding user response:
From the functions you're using (e.g. toPandas()), it seems like you may be using PySpark; if so, you should make that clear in your question.
I'm going to ignore the PySpark part and assume we're just talking about a Pandas DataFrame:
>>> import pandas as pd
>>> df = pd.DataFrame({'A':['apple', 'hello', 'carrot'], 'B':[4, 5, 6], 'C':[7, 8, 9] })
>>> df
        A  B  C
0   apple  4  7
1   hello  5  8
2  carrot  6  9
DataFrames have three primary ways to access rows, columns, and cells.
The first way is by indexing with a column name directly on the DataFrame. Example:
>>> df['A']
0     apple
1     hello
2    carrot
Name: A, dtype: object
The second is with .loc[rowindexvalue, colname]. To select the 'A' column, you'd put : for the rowindex portion, which tells Pandas to select all rows. Example:
>>> df.loc[:, 'A']
0     apple
1     hello
2    carrot
Name: A, dtype: object
The third way is with .iloc[rowindex, colindex]. You can only use integer positions with .iloc (you cannot use column names). So to select the first column and all rows in our example, you'd do this:
>>> df.iloc[:, 0]
0     apple
1     hello
2    carrot
Name: A, dtype: object
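Since the question asks about using an index instead of column names, here is a rough sketch of looping over column positions with .iloc. It assumes a pandas DataFrame with the question's data; the dictionary name `by_position` is just illustrative.

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"A": ["apple", "hello", "carrot"], "B": [4, 5, 6], "C": [7, 8, 9]})

by_position = {}
# df.shape[1] is the number of columns; i is an integer position, not a label.
for i in range(df.shape[1]):
    name = df.columns[i]                      # look up the label for this position
    by_position[name] = df.iloc[:, i].tolist()  # all rows of the i-th column

print(by_position)
```

This never spells out a column name by hand, which may help when there are many columns.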
To convert any of the above examples into a Python list, you can simply wrap it in the list() function. Using our first example above, that would be:
>>> list(df['A'])
['apple', 'hello', 'carrot']
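As an alternative sketch, a Series also has a tolist() method, which should give the same result as list() for this data (the variable names below are illustrative):

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"A": ["apple", "hello", "carrot"], "B": [4, 5, 6], "C": [7, 8, 9]})

via_list = list(df["A"])       # built-in list() over the Series
via_tolist = df["A"].tolist()  # Series method

print(via_list)
print(via_tolist)
```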
Finally, you can iterate over the columns like this:
>>> for c in df.columns:
...     print(f"{c}: {list(df[c])}")
...
A: ['apple', 'hello', 'carrot']
B: [4, 5, 6]
C: [7, 8, 9]
>>>
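If the goal is to get every column as a list in one step, one possible shortcut is DataFrame.to_dict with orient="list". This is a sketch assuming a pandas DataFrame with the question's data, and `columns_as_lists` is an illustrative name:

```python
import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame({"A": ["apple", "hello", "carrot"], "B": [4, 5, 6], "C": [7, 8, 9]})

# Maps each column name to a list of that column's values.
columns_as_lists = df.to_dict(orient="list")

for name, values in columns_as_lists.items():
    print(f"{name}: {values}")
```

The resulting dictionary can then be indexed by column name, for example columns_as_lists["B"].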