Dataframe from list

I am trying to create a dataframe from multiple lists. Each list is a row that I want to append to my dataframe.

for i in tmpList:
    data = data.append(getTFSData(i))

tmpList contains a list of IDs. getTFSData() fetches some data via a web request and returns a list of those values:

responseDict = [
    responseRaw['id'],
    responseRaw['Title'],
    responseRaw['state'],
    responseRaw['IterationPath'],
    responseRaw['Tags']
]

return responseDict

I am expecting each value of the list to be a column, but instead each value ends up as a row in column 0.

CodePudding user response：

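It is better to return the raw response as a dictionary and build the dataframe from a list of those dictionaries, so the keys become the column names: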

import random
import pandas as pd


def main():
    ids_list = [1, 2, 3, 4, 5]
    data = [get_tfs_data(data_id) for data_id in ids_list]
    df = pd.DataFrame(data)
    print(df)


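def get_tfs_data(data_id):
    # Stand-in for the real web request: random values replace the fetched fields,
    # so data_id is not actually used here.
    response_raw = {
        "id": random.randint(0, 10),
        "Title": random.randint(0, 10),
        "state": random.randint(0, 10),
        "IterationPath": random.randint(0, 10),
        "Tags": random.randint(0, 10),
    }
    # Return the dictionary itself; pd.DataFrame turns each key into a column.
    return response_raw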


if __name__ == "__main__":
    main()

   id  Title  state  IterationPath  Tags
0   1      3      9              9     3
1   4      3      2              0     9
2   5      2      2              2     3
3   3      1      7             10     6
4   0      8      1              6     5
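
If getTFSData() has to keep returning a list, the same approach should still work as long as the column names are supplied separately. The following is a minimal sketch under that assumption; fetch_row() is a hypothetical stand-in for getTFSData(), and the values it returns are made up for illustration.

```python
import pandas as pd

# Column names in the same order as the values in each returned list.
COLUMNS = ["id", "Title", "state", "IterationPath", "Tags"]


def fetch_row(data_id):
    # Hypothetical stand-in for getTFSData(): returns one row as a flat list.
    return [data_id, f"Item {data_id}", "Active", "Sprint 1", "demo"]


# Collect all rows first, then build the dataframe in a single call.
rows = [fetch_row(data_id) for data_id in [1, 2, 3]]
df = pd.DataFrame(rows, columns=COLUMNS)
print(df)
```

Each inner list becomes one row, and the columns argument labels the positions, so the order of COLUMNS has to match the order of the values in the list.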

CodePudding user response：

In general, it's better to build the full data structure first and then use one of the dataframe constructors to create the dataframe in one go. Note that DataFrame.append has been deprecated since pandas 1.4.0 and was removed in pandas 2.0.
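
The symptom in the question (every value landing in column 0) is probably down to how pandas interprets a flat list: as a single column rather than a single row. The sketch below illustrates the difference with made-up values; wrapping the list in another list makes it one row.

```python
import pandas as pd

row = [101, "Fix login", "Active"]  # illustrative values only

# A flat list is read as one column (named 0) with one value per row.
as_column = pd.DataFrame(row)

# A list of lists is read as rows, so the same values become one row.
as_row = pd.DataFrame([row])

print(as_column.shape)  # (3, 1)
print(as_row.shape)     # (1, 3)
```

Collecting all rows in an outer list and calling the constructor once therefore gives the expected layout.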

If you have a list of lists where each sublist is a column rather than a row, you can zip the sublists with a sequence of column names and pass the resulting dictionary to pd.DataFrame.

>>> data = [['a', 'b', 'c'], ['d', 'e', 'f']]
>>> cols = ["foo", "bar"]
>>> df = pd.DataFrame(dict(zip(cols, data)))
>>> df
  foo bar
0   a   d
1   b   e
2   c   f