How to set a list as a value in a dataframe?

Time:11-14

I want to insert each GitHub user's monthly activity (a list whose length varies from month to month) into the cell under the column for the corresponding year and month (e.g., 2021_01, 2022_10).


The XPath of the activity entries is:

//*[@id="js-contribution-activity"]/div/div/div/div

This is what my CSV file (loaded as df1) looks like:

                                   LinkedIn Website                      GitHub Website             user
0  https://www.linkedin.com/in/chad-roberts-b86699/           https://github.com/crobby           crobby
1      https://www.linkedin.com/in/grahamdumpleton/  https://github.com/GrahamDumpleton  GrahamDumpleton

Here is my best try so far:

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
    
driver = webdriver.Chrome("/Users/fredr/Desktop/chromedriver")

for index, row in df1.iterrows():
    try:
        user = row["user"]
    except:
        pass
    for y in range(2019, 2021):
        for m in range(8, 11):
            current_url = f"https://github.com/{user}?tab=overview&from={y}-{str(m).zfill(2)}-01&to={y}-{str(m).zfill(2)}-31"
            wait = WebDriverWait(driver, 30)
            driver.get(current_url)
            contributions = wait.until(
                EC.visibility_of_all_elements_located(
                    (By.XPATH, "//*[@id='js-contribution-activity']/div/div/div/div")
                )
            )
            list_cont = []
            for contribution in contributions:
                list_cont.append(contribution.text)
            df1.loc[index, f"{str(y)}_{str(m)}"] = list_cont

But it raises the following error:

    ValueError                                Traceback (most recent call last)
    <ipython-input-101-40d6825cbbdb> in <module>
         15                 print(value.text)
         16                 list_cont.append(value.text)
    ---> 17             df1.loc[index, f'{str(y)}_{str(m)}'] = list_cont
    
    ~\anaconda3\lib\site-packages\pandas\core\indexing.py in __setitem__(self, key, value)
        690 
        691         iloc = self if self.name == "iloc" else self.obj.iloc
    --> 692         iloc._setitem_with_indexer(indexer, value, self.name)
        693 
        694     def _validate_key(self, key, axis: int):
    
    ~\anaconda3\lib\site-packages\pandas\core\indexing.py in _setitem_with_indexer(self, indexer, value, name)
       1633         if take_split_path:
       1634             # We have to operate column-wise
    -> 1635             self._setitem_with_indexer_split_path(indexer, value, name)
       1636         else:
       1637             self._setitem_single_block(indexer, value, name)
    
    ~\anaconda3\lib\site-packages\pandas\core\indexing.py in _setitem_with_indexer_split_path(self, indexer, value, name)
       1686                     return self._setitem_with_indexer((pi, info_axis[0]), value[0])
       1687 
    -> 1688                 raise ValueError(
       1689                     "Must have equal len keys and value "
       1690                     "when setting with an iterable"
    
    ValueError: Must have equal len keys and value when setting with an iterable

CodePudding user response:

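You get this error because you are assigning to a single cell (one index label and one column) but passing a list. pandas treats a list on the right-hand side as a sequence of values to spread across the selected cells, not as one value, so the number of target cells and the number of list items do not match.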
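To illustrate, here is a minimal sketch of the case where the lengths do match. The frame and its column names (`a`, `b`) are hypothetical and only serve as an example: two cells are selected and two values are supplied, so pandas assigns one list element per cell. With a single selected cell and a longer list, the lengths differ, which is what the error message reports.

```python
import pandas as pd

# Hypothetical two-column frame, used only for illustration.
df = pd.DataFrame({"a": [0, 0], "b": [0, 0]})

# Two selected cells (columns "a" and "b" of row 0) and two values:
# pandas assigns one list element to each cell.
df.loc[0, ["a", "b"]] = [1, 2]

print(df)
```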

If your intention is to keep the whole list in one cell, one option is to store its string representation:

  • replace df1.loc[index, f"{str(y)}_{str(m)}"] = list_cont
  • with df1.loc[index, f"{str(y)}_{str(m)}"] = str(list_cont)

Then:

print(df1)
# Output
                                   LinkedIn Website  \
0  https://www.linkedin.com/in/chad-roberts-b86699/   
1      https://www.linkedin.com/in/grahamdumpleton/   

                       GitHub Website             user  \
0           https://github.com/crobby           crobby   
1  https://github.com/GrahamDumpleton  GrahamDumpleton   

   2019_8  
0  ['', 'Created an issue in thoth-station/s2i-thoth that received 6 comments\nAug 19\ns2i build not activating Thoth\nI have some source with all of the Thoth env variables set and my project includes a .thoth file, but when I do the s2i build process, it never see…\n6 comments', '', 'Opened 1 other issue in 1 repository\nargoproj/argo-workflows\n1 closed\nPermissions for files from s3 input artifact "directory" too restrictive\nAug 26']  
1  ['', 'Created 260 commits in 13 repositories', '', 'Created a pull request in openshift-labs/lab-tekton-pipelines that received 12 comments\nAug 12\nFixes so works out of the box with OCP 4.\nThe original master did not work out of the box on a fresh OCP 4 cluster with Subscription and CatalogSource resources breaking ability to install …\n 107 −911 •\n12 comments', '', 'Opened 3 other pull requests in 1 repository\nsjbylo/lab-ocp4\n3 merged\nMinor changes to make some things easier.\nAug 6\...  
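Note that str(list_cont) stores text, not a list object. If an actual list is needed later, two possibilities are sketched below, assuming a small stand-in frame and made-up scraped data (`df`, `scraped`, and the column names are hypothetical, not taken from the question). Option 1 creates the column with object dtype first and then sets each cell with `.at`, which keeps real list objects. Option 2 keeps the string form and parses it back with `ast.literal_eval` when needed.

```python
import ast

import pandas as pd

# Hypothetical stand-in for df1; only the "user" column matters here.
df = pd.DataFrame({"user": ["crobby", "GrahamDumpleton"]})

# Hypothetical scraped results: one list per row label, with different lengths.
scraped = {
    0: ["Created an issue", "Opened 1 other issue"],
    1: ["Created 260 commits", "Created a pull request", "Opened 3 other pull requests"],
}

# Option 1: keep real list objects.
# Create the column with object dtype first, then set one cell at a time with .at.
df["2019_08"] = pd.Series([None] * len(df), index=df.index, dtype=object)
for index, items in scraped.items():
    df.at[index, "2019_08"] = items

# Option 2: store the string form, then parse it back into a list when needed.
df["2019_08_str"] = df["2019_08"].apply(str)
recovered = df["2019_08_str"].apply(ast.literal_eval)

print(type(df.at[0, "2019_08"]))      # list object stored in the cell
print(type(df.at[0, "2019_08_str"]))  # plain string
```

The string form survives a round trip through a CSV file, whereas list objects written with to_csv come back as strings anyway, so the choice mostly depends on whether the frame stays in memory or is saved to disk.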