Creating a DataFrame using lists

Time:02-10

I am trying to create a DataFrame that looks like this Excel sheet, but I can't figure out how to do so. Here is the code I am attempting to use:

import pandas as pd 
ls_super = ['Supernatant',total_volume_super_ml, Total_activity_super, total_protein_super_mg, specific_activity_sup, 100, 1]
df3 = pd.DataFrame(ls_super, columns =['Sample','Total Volume','Total activity','Total protein','Specific Activity','percent yield,','purification' 
])


Here is the error message:

ValueError                                Traceback (most recent call last)
/tmp/ipykernel_125580/4224246098.py in <module>
     20
     21 # list of strings
---> 22 df3 = pd.DataFrame(ls_super, columns =['Sample','Total Volume','Total activity','Total protein','Specific Activity','percent yield,','purification'
     23 ])
     24

~/.local/lib/python3.8/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    709                 )
    710             else:
--> 711                 mgr = ndarray_to_mgr(
    712                     data,
    713                     index,

~/.local/lib/python3.8/site-packages/pandas/core/internals/construction.py in ndarray_to_mgr(values, index, columns, dtype, copy, typ)
    322     )
    323
--> 324     _check_values_indices_shape_match(values, index, columns)
    325
    326     if typ == "array":

~/.local/lib/python3.8/site-packages/pandas/core/internals/construction.py in _check_values_indices_shape_match(values, index, columns)
    391         passed = values.shape
    392         implied = (len(index), len(columns))
--> 393         raise ValueError(f"Shape of passed values is {passed}, indices imply {implied}")
    394
    395

ValueError: Shape of passed values is (7, 1), indices imply (7, 7)

CodePudding user response:

Problem: the DataFrame() constructor interprets the one-dimensional Python list ls_super as 7 rows of 1 column ("Shape of passed values is (7, 1)"), not as 1 row of 7 columns (which would have shape (1, 7)). Seven column names cannot be matched to a single column of data, hence the ValueError.

Solution: add a second set of brackets ([]) around your definition of the ls_super list. In other words, make ls_super a two-dimensional list. The DataFrame constructor then sees a single value in the first dimension, and seven values in the second dimension, producing the desired shape of (1, 7).

ls_super = [['Supernatant',1, 2, 3, 4, 100, 1]]

df3 = pd.DataFrame(ls_super, 
                   columns=['Sample', 'Total Volume', 'Total activity', 'Total protein', 'Specific Activity', 'percent yield', 'purification'])
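
To see the difference in shape directly, here is a minimal sketch. The numbers are placeholders standing in for the measured values in the question, and the names row and flat are introduced only for this illustration.

```python
import pandas as pd

columns = ['Sample', 'Total Volume', 'Total activity', 'Total protein',
           'Specific Activity', 'percent yield', 'purification']

# Placeholder numbers stand in for the measured values in the question.
row = ['Supernatant', 1, 2, 3, 4, 100, 1]

# A flat list is read as seven rows of one column.
flat = pd.DataFrame(row)
print(flat.shape)  # (7, 1)

# Wrapping the list in another list makes it one row of seven columns,
# so the seven column names now line up with the data.
df3 = pd.DataFrame([row], columns=columns)
print(df3.shape)  # (1, 7)
```

Writing [row] at the call site has the same effect as defining the list with double brackets.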

CodePudding user response:

import pandas as pd

first_row_list = [
    "Supernatant",
    total_volume_super_ml,
    Total_activity_super,
    total_protein_super_mg,
    specific_activity_sup,
    100,
    1
]

columns = [
    "Sample",
    "Total Volume",
    "Total Activity",
    "Total protein",
    "Specific Activity",
    "percent yield",
    "purification"
]

d = dict(zip(columns, [[f] for f in first_row_list]))
df = pd.DataFrame(d)

Or write the dictionary out directly, with each value wrapped in a one-element list:

d = {'Sample': ["Supernatant"],
      'Total Volume': [total_volume_super_ml],
      'Total Activity': [Total_activity_super],
      'Total protein': [total_protein_super_mg],
      'Specific Activity': [specific_activity_sup],
      'percent yield': [100],
      'purification': [1]}

df = pd.DataFrame(d)
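
A short sketch of what the zip-based version builds, assuming placeholder numbers in place of the variables from the question. Each column name is paired with a one-element list, which pandas reads as a column holding a single row.

```python
import pandas as pd

columns = ["Sample", "Total Volume", "Total Activity", "Total protein",
           "Specific Activity", "percent yield", "purification"]

# Placeholder values; the question computes these elsewhere.
first_row_list = ["Supernatant", 1.0, 2.0, 3.0, 4.0, 100, 1]

# zip pairs each column name with a one-element list holding its value.
d = dict(zip(columns, [[f] for f in first_row_list]))
print(d["Sample"])        # ['Supernatant']
print(d["percent yield"])  # [100]

# Each dict key becomes a column; each one-element list supplies one row.
df = pd.DataFrame(d)
print(df.shape)  # (1, 7)
```

The dict form costs more typing than the list-of-lists form, but it keeps each value next to its column name, which makes a misplaced value easier to spot.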

CodePudding user response:

The pandas DataFrame constructor accepts a list of lists, where each inner list is one row. Wrap ls_super in an outer list when passing it in:

import pandas as pd
from random import uniform

total_volume_super_ml = uniform(0,1)
Total_activity_super = uniform(0,1)
total_protein_super_mg = uniform(0,1)
specific_activity_sup = uniform(0,1)

ls_super = ['Supernatant',total_volume_super_ml, Total_activity_super, total_protein_super_mg, specific_activity_sup, 100, 1]
df3 = pd.DataFrame([ls_super], columns =['Sample','Total Volume','Total activity','Total protein','Specific Activity','percent yield','purification'])
df3
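
The same list-of-lists form extends to a table with several rows, such as the sheet described in the question. This is a sketch only: the second sample name and all numbers are hypothetical placeholders, not data from the question.

```python
import pandas as pd

columns = ['Sample', 'Total Volume', 'Total activity', 'Total protein',
           'Specific Activity', 'percent yield', 'purification']

# Hypothetical rows: the names and numbers are placeholders for illustration.
rows = [
    ['Supernatant', 10.0, 500.0, 25.0, 20.0, 100, 1],
    ['Fraction A', 5.0, 400.0, 10.0, 40.0, 80, 2],
]

# Each inner list becomes one row; its length must match len(columns).
df3 = pd.DataFrame(rows, columns=columns)
print(df3.shape)  # (2, 7)
print(df3)
```

Further samples can be added by appending another seven-element list to rows before constructing the DataFrame.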