Taking python output to a pandas dataframe

Time:02-27

I'm trying to load the output of this code into a pandas DataFrame. I only want the first part of the output: the stock symbol, company name, field3, and field4. The response contains a lot of other data I'm not interested in, but right now I get everything. Could someone help me put this into a DataFrame?

The current output is in this format

["ABBV","AbbVie","_DRUGM","S&P 100, S&P 500"],["ABC","AmerisourceBergen","_MEDID","S&P 500"],

Desired Output


Full Code

    import requests
    import pandas as pd
    
    url = "https://www.stockrover.com/build/production/Research/tail.js?1644930560"
    
    payload={}
    headers = {}
    
    response = requests.request("GET", url, headers=headers, data=payload)
    
    print(response.text)

CodePudding user response:

Use a dictionary to store the data from your tuple of lists, then create a DataFrame based on that dictionary. In my solution below, I omit the 'ID' field because the index of the DataFrame serves the same purpose.

import pandas as pd

# Store the data you're getting from requests
data = ["ABBV","AbbVie","_DRUGM","S&P 100, S&P 500"],["ABC","AmerisourceBergen","_MEDID","S&P 500"]

# Create an empty dictionary with relevant keys
dic = {
    "Ticker": [],
    "Name": [],
    "Field3": [],
    "Field4": []
}

# Append data to the dictionary for every list in your `response`
for lst in data:
    dic['Ticker'].append(lst[0])
    dic['Name'].append(lst[1])
    dic['Field3'].append(lst[2])
    dic['Field4'].append(lst[3])

# Create a DataFrame from the dictionary above
df = pd.DataFrame(dic)

The resulting DataFrame has one row per list, with the columns Ticker, Name, Field3 and Field4.

Edit: A More Efficient Approach

In my solution above, I appended to each key's list by hand. Using zip, we can streamline the process so that it works for any number of fields per row and keeps working if you change the labels in the dictionary.

The only caveat is that the order of the keys in the dictionary must match the order of the items in each list from your response. For example, if Ticker is the first dictionary key, the ticker must be the first item in each list. The first solution had the same dependency on item order, since it indexed each list by position.

new_dic = {
    "Ticker": [],
    "Name": [],
    "Field3": [],
    "Field4": []
}

for lst in data: # Iterate over each list
    for key, item in zip(new_dic, lst): # Pair each key with the item at the same position
        new_dic[key].append(item) # Append the item to that key's list

df = pd.DataFrame(new_dic)

The result is identical to that of the method above.
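To make the ordering caveat concrete, here is a small sketch in plain Python (no pandas needed). The `keys` list and the variable names are just for illustration. It relies on zip pairing items purely by position, which is also why iterating over the dictionary works: dictionaries keep their insertion order in Python 3.7 and later.

```python
data = (
    ["ABBV", "AbbVie", "_DRUGM", "S&P 100, S&P 500"],
    ["ABC", "AmerisourceBergen", "_MEDID", "S&P 500"],
)

# Keys in the same order as the items in each list
keys = ["Ticker", "Name", "Field3", "Field4"]
first = dict(zip(keys, data[0]))

# Keys in the wrong order: no error is raised, the values are just mislabeled
swapped = dict(zip(["Name", "Ticker", "Field3", "Field4"], data[0]))

# A row with fewer items than keys: zip stops at the shorter input,
# so the remaining keys are silently skipped
short = dict(zip(keys, ["ABBV", "AbbVie"]))

print(first)
print(swapped)
print(short)
```

Neither mistake raises an exception, so it is worth checking the first few rows of the DataFrame by eye after changing the labels.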

Edit (even better!)

I'm coming back to this after learning from a commenter that pd.DataFrame() accepts two-dimensional data, such as a list or tuple of lists, directly. This makes the dictionary unnecessary and reduces the whole process to a single call:

import pandas as pd

# Store the data you're getting from requests
data = ["ABBV","AbbVie","_DRUGM","S&P 100, S&P 500"],["ABC","AmerisourceBergen","_MEDID","S&P 500"]

# Define columns
columns = ['ticker', 'name', 'field3', 'field4']

df = pd.DataFrame(data, columns=columns)

The result is the same as in the first two solutions, apart from the lowercase column names.
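The snippets above hardcode `data`, whereas the question starts from `response.text`, which contains a lot of other content. As a rough sketch, assuming the rows of interest always appear as a bracketed group of exactly four double-quoted strings (as in the sample from the question), a regular expression can pull them out and json.loads can turn each match into a list. The `sample_text` string below is a made-up stand-in for `response.text`; I have not checked this against the real file, which may contain other four-string groups or strings with escaped quotes that this pattern would mishandle.

```python
import json
import re

import pandas as pd

# Made-up stand-in for response.text: the two rows from the question,
# surrounded by unrelated content
sample_text = (
    'var x = 1; '
    '["ABBV","AbbVie","_DRUGM","S&P 100, S&P 500"],'
    '["ABC","AmerisourceBergen","_MEDID","S&P 500"],'
    ' more unrelated content'
)

# Match "[", four double-quoted strings separated by commas, then "]"
row_pattern = re.compile(r'\[\s*"[^"]*"(?:\s*,\s*"[^"]*"){3}\s*\]')

# Each match is valid JSON, so json.loads converts it to a Python list
rows = [json.loads(match) for match in row_pattern.findall(sample_text)]

columns = ['ticker', 'name', 'field3', 'field4']
df = pd.DataFrame(rows, columns=columns)
print(df)
```

With the real response, you would pass `response.text` in place of `sample_text` and then inspect the DataFrame to confirm that only the intended rows were matched.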
