I'm trying to generate a DataFrame that has random values; something like this:
In [75]: df
Out[75]:
        Name       mag1       mag2       mag3  redshift
0   Galaxy 1  11.657170  12.881492  14.230583    0.1125
1   Galaxy 2  19.720113  14.297871        NaN    1.2252
2   Galaxy 3  11.026038  11.116287  17.689447    2.5548
3   Galaxy 4        NaN  16.218209  11.928297    1.8845
4   Galaxy 5  15.287412  19.199692  19.392112    4.5512
5   Galaxy 6  12.283413  12.425423  19.141460    0.9583
6   Galaxy 7  18.738156        NaN  16.179031    1.8271
7   Galaxy 8  16.277030  13.728240  11.800716    2.8819
8   Galaxy 9  16.672178  14.608468  10.145000    3.9710
9  Galaxy 10  17.836160  17.828570  13.813578    0.2790
The columns have been generated with
col0 = ['Galaxy 1','Galaxy 2','Galaxy 3','Galaxy 4','Galaxy 5','Galaxy 6','Galaxy 7','Galaxy 8','Galaxy 9','Galaxy 10']
col1 = np.random.uniform(10, 20, 10)
col2 = np.random.uniform(10, 20, 10)
col3 = np.random.uniform(10, 20, 10)
col4 = np.random.uniform(0.01, 5, 10)
and stitched together with
df = pd.DataFrame(list(zip(col0, col1, col2, col3, col4)))
The NaNs were inserted manually (no NaNs in redshift).
This works fine, but how could I automate this to produce a random DataFrame with a variable number of mags but with a similar structure? Perhaps with a call like df = random_df(size=(20, 5)) for 20 galaxies and 5 mag columns?
CodePudding user response:
import pandas as pd
import numpy as np

def make_test_df(n_galaxies=10, n_mags=3, seed=0):
    np.random.seed(seed)
    # Magnitudes drawn uniformly from [10, 20), one column per mag
    data = np.random.uniform(10, 20, (n_galaxies, n_mags))
    # One NaN per mag column, each in a different row (requires n_mags <= n_galaxies)
    data[(np.random.choice(n_galaxies, n_mags, replace=False), range(n_mags))] = np.nan
    df = pd.DataFrame(data, columns=[f'mag{i}' for i in range(1, n_mags + 1)])
    df.insert(0, 'Name', [f'Galaxy {i}' for i in range(1, n_galaxies + 1)])
    # redshift is added last and contains no NaNs
    df['redshift'] = np.random.uniform(0.01, 5, n_galaxies)
    return df
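The NaN placement in the function above relies on NumPy integer array indexing: a pair of equal-length index sequences selects the elements (row[0], col[0]), (row[1], col[1]), and so on. Here is a small standalone sketch of just that step; the array size and the names rows and cols are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.zeros((6, 3))

# One distinct row per column, drawn without replacement.
rows = rng.choice(6, size=3, replace=False)
cols = np.arange(3)

# Paired index arrays address (rows[0], cols[0]), (rows[1], cols[1]), ...
data[rows, cols] = np.nan

print(np.isnan(data).sum(axis=0))        # one NaN in each column: [1 1 1]
print(np.isnan(data).sum(axis=1).max())  # no row has more than one NaN: 1
```

Because the rows are drawn without replacement, this only works when there are at least as many rows as columns.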
Result of make_test_df(20, 5):
         Name       mag1       mag2       mag3       mag4       mag5  redshift
 0   Galaxy 1  15.488135  17.151894  16.027634  15.448832  14.236548  1.494210
 1   Galaxy 2  16.458941  14.375872        NaN  19.636628  13.834415  4.070851
 2   Galaxy 3  17.917250  15.288949  15.680446  19.255966  10.710361  1.988564
 3   Galaxy 4  10.871293  10.202184  18.326198  17.781568  18.700121  4.406705
 4   Galaxy 5  19.786183  17.991586  14.614794  17.805292  11.182744  2.910552
 5   Galaxy 6  16.399210  11.433533  19.446689        NaN  14.146619  4.409859
 6   Galaxy 7        NaN  17.742337  14.561503  15.684339  10.187898  3.465733
 7   Galaxy 8  16.176355  16.120957  16.169340  19.437481  16.818203  3.629019
 8   Galaxy 9  13.595079  14.370320  16.976312  10.602255  16.667667  2.511609
 9  Galaxy 10  16.706379        NaN  11.289263  13.154284  13.637108  4.780857
10  Galaxy 11  15.701968  14.386015  19.883738  11.020448  12.088768  3.223511
11  Galaxy 12  11.613095  16.531083  12.532916  14.663108  12.444256  2.125037
12  Galaxy 13  11.589696  11.103751  16.563296  11.381830  11.965824  3.035902
13  Galaxy 14  13.687252  18.209932  10.971013  18.379449  10.960984  0.105774
14  Galaxy 15  19.764595  14.686512  19.767611  16.048455        NaN  1.514858
15  Galaxy 16  10.391878  12.828070  11.201966  12.961402  11.187277  3.304266
16  Galaxy 17  13.179832  14.142630  10.641475  16.924721  15.666015  1.457487
17  Galaxy 18  12.653895  15.232481  10.939405  15.759465  19.292962  3.093897
18  Galaxy 19  13.185690  16.674104  11.317979  17.163272  12.894061  2.149556
19  Galaxy 20  11.831914  15.865129  10.201075  18.289400  10.046955  0.686016
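If you would rather have the random_df(size=(20, 5)) call from the question, and a tunable share of missing values instead of exactly one NaN per mag column, something along these lines might work. This is only a sketch: the nan_frac parameter is my own invention, and it uses np.random.default_rng rather than np.random.seed, so the numbers will not match the output above.

```python
import numpy as np
import pandas as pd


def random_df(size=(10, 3), nan_frac=0.1, seed=None):
    """Return a test DataFrame with size[0] galaxies and size[1] mag columns.

    nan_frac (hypothetical parameter) is the approximate fraction of mag
    entries replaced by NaN. The redshift column never gets NaNs.
    """
    n_galaxies, n_mags = size
    rng = np.random.default_rng(seed)

    mags = rng.uniform(10, 20, size=(n_galaxies, n_mags))
    # Boolean mask: each mag entry is blanked independently with probability nan_frac.
    mags[rng.random(mags.shape) < nan_frac] = np.nan

    df = pd.DataFrame(mags, columns=[f"mag{i}" for i in range(1, n_mags + 1)])
    df.insert(0, "Name", [f"Galaxy {i}" for i in range(1, n_galaxies + 1)])
    df["redshift"] = rng.uniform(0.01, 5, size=n_galaxies)
    return df


df = random_df(size=(20, 5), seed=42)
print(df.shape)  # (20, 7): Name, five mag columns, redshift
```

Since the mask is random, the number of NaNs varies from run to run unless you fix the seed, and a row can end up with more than one NaN.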