I am trying to download a dataset from Kaggle through my Django app. In my utils module, I have this code:
import os

def search_kaggle(search_term):
    # Run the Kaggle CLI search and return its output as a list of lines
    search_results = os.popen("kaggle datasets list -s " + search_term).read().splitlines()
    return search_results
In my view function, I have this:
def search_dataset(request):
    context = {}
    print('search dataset reached')
    if request.method == "POST":
        searchkey = request.POST["searchkey"]
        dtsite = request.POST["dtsite"]
        dtsnum = request.POST["dtsnum"]
        if searchkey != "":
            if dtsite == "kaggle":
                results = search_kaggle(dtsite)
                context['results'] = results
                print("Kaggle reached")
            if dtsite == "datagov":
                print("datagov")
            if dtsite == "uci":
                print("UCI")
            if dtsite == "googlepd":
                print("googlepd")
        else:
            messages.error(request, " You must select a search keyword!")
    return render(request, 'datasetsearch/dataset_results.html', context)
When I run the code, it does return data from Kaggle, but that data is completely different from what I get when I run the same command from the command line:
kaggle datasets list -s 'fraud detection'
In the code above, search_term is 'fraud detection', so I expected the same results, but I am getting something different. The command-line output is the correct one.
Command-line result:
ref title size lastUpdated downloadCount v
---------------------------------------------------------------- -------------------------------------------------- ----- ------------------- ------------- -
mlg-ulb/creditcardfraud Credit Card Fraud Detection 66MB 2018-03-23 01:17:27 430457
ealaxi/paysim1 Synthetic Financial Datasets For Fraud Detection 178MB 2017-04-03 08:40:34 55698
mishra5001/credit-card Credit Card Fraud Detection 112MB 2019-07-15 06:36:02 8706
kartik2112/fraud-detection Credit Card Transactions Fraud Detection Dataset 202MB 2020-08-05 15:20:55 13158
rohitrox/healthcare-provider-fraud-detection-analysis HEALTHCARE PROVIDER FRAUD DETECTION ANALYSIS 25MB 2019-05-09 19:50:55 11674
rupakroy/online-payments-fraud-detection-dataset Online Payments Fraud Detection Dataset 178MB 2022-04-17 15:34:44 3985
vagifa/ethereum-frauddetection-dataset Ethereum Fraud Detection Dataset 923KB 2021-01-03 10:05:14 1418
shayannaveed/credit-card-fraud-detection Credit Card Fraud Detection 66MB 2019-12-24 08:07:24 1233
shivamb/vehicle-claim-fraud-detection Vehicle Insurance Claim Fraud Detection 348KB 2021-12-20 04:26:36 2325
saurabhbagchi/credit-card-fraud-detection Credit Card Fraud Detection 28MB 2021-07-18 14:27:20 909
volodymyrgavrysh/fraud-detection-bank-dataset-20k-records-binary Fraud detection bank dataset 20K records binary 738KB 2021-08-08 15:12:01 2184
isaikumar/creditcardfraud Credit Card Fraud Detection Dataset 66MB 2018-05-05 09:38:01 4386
gopalmahadevan/fraud-detection-example Fraud Detection Example 3MB 2021-08-01 02:31:29 652
tanisha1416/promo-abuse-detection-for-payment-apps Promo Code Abuse Detection (Fraud Detection) 25KB 2021-08-07 07:13:13 208
ealtman2019/credit-card-transactions Credit Card Transactions 263MB 2021-10-14 17:42:24 2542
ealaxi/banksim1 Synthetic data from a financial payment system 13MB 2017-07-11 14:48:56 23766
dhanushnarayananr/credit-card-fraud Credit Card Fraud 29MB 2022-05-07 15:09:29 2833
muhakabartay/yourallmodelsdata IEEE-CIS Fraud Detection Models Data 28MB 2019-09-18 07:57:04 125
dileep070/anomaly-detection Credit card fraud detection 43MB 2019-06-19 06:00:05 962
mrmorj/fraud-detection-in-electricity-and-gas-consumption Fraud Detection in Electricity and Gas Consumption 87MB 2020-08-24 12:29:16 1205
Python script result:
ref title size lastUpdated downloadCount voteCount usabilityRating
------------------------------------- -------------------------------------------------- ----- ------------------- ------------- --------- ---------------
kaggle/meta-kaggle Meta Kaggle 6GB 2022-08-01 06:39:59 10828 653 0.7647059
kaggle/kaggle-survey-2018 2018 Kaggle Machine Learning & Data Science Survey 4MB 2018-11-03 22:35:07 17710 1008 0.85294116
kaggle/world-development-indicators World Development Indicators 369MB 2017-05-01 17:50:44 62053 1604 0.7647059
kaggle/kaggle-survey-2017 2017 Kaggle Machine Learning & Data Science Survey 4MB 2017-10-27 22:03:03 25672 854 0.8235294
kaggle/sf-salaries SF Salaries 11MB 2019-12-05 23:30:07 54209 713 0.7058824
alsgroup/end-als End ALS Kaggle Challenge 12GB 2021-04-08 12:16:37 1485 177 0.9375
kaggle/hillary-clinton-emails Hillary Clinton's Emails 12MB 2019-11-14 05:31:24 17379 288 0.7058824
kaggle/college-scorecard US Dept of Education: College Scorecard 562MB 2017-11-09 18:03:11 14214 214 0.7647059
kaggle/recipe-ingredients-dataset Recipe Ingredients Dataset 2MB 2017-01-19 02:55:45 11082 195 0.75
kaggle/reddit-comments-may-2015 May 2015 Reddit Comments 20GB 2019-06-04 10:06:44 9124 280 0.64705884
kaggle/us-baby-names US Baby Names 173MB 2017-11-21 22:18:15 29489 320 0.5882353
morriswongch/kaggle-datasets Kaggle Datasets 3MB 2018-12-02 03:50:47 1819 72 0.8235294
kaggle/us-consumer-finance-complaints US Consumer Finance Complaints 84MB 2019-11-14 05:52:29 17837 286 0.5882353
pavlofesenko/titanic-extended Titanic extended dataset (Kaggle Wikipedia) 134KB 2019-03-06 09:53:24 9419 133 0.9411765
canggih/voted-kaggle-dataset Upvoted Kaggle Datasets 1MB 2018-02-26 10:10:34 1268 33 1.0
canggih/upvoted-kaggle-kernels Upvoted Kaggle Kernels 115KB 2018-02-26 16:52:28 207 27 1.0
jessevent/all-kaggle-datasets Complete Kaggle Datasets Collection 390KB 2018-01-16 12:32:58 2099 109 0.8235294
kaggle/no-data-sources No Data Sources 159B 2017-04-12 20:45:12 1144 139 0.4375
kaggle/kaggle-blog-winners-posts Kaggle Blog: Winners' Posts 519KB 2016-09-21 02:21:21 766 43 0.7058824
kaggle/2015-notebook-ux-survey 2015 Notebook UX Survey 198KB 2017-05-01 17:56:25 1033 49 0.64705884
CodePudding user response:
You are not passing the search term to search_kaggle(). Instead, you are passing the string "kaggle" via the variable dtsite:
    if dtsite == "kaggle":
        results = search_kaggle(dtsite)
Change this to:
    if dtsite == "kaggle":
        results = search_kaggle(searchkey)
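As a side note, once searchkey is passed through, a term such as 'fraud detection' contains a space. os.popen hands the concatenated string to a shell, which would split it into two arguments unless it is quoted. A minimal sketch of one way around this is to use subprocess.run with an argument list, which bypasses the shell. The helper name build_kaggle_search_command is hypothetical and only there to make the argument list easy to inspect; the command itself is the same `kaggle datasets list -s` call used in the question, and Python 3.7+ is assumed.

```python
import subprocess


def build_kaggle_search_command(search_term):
    # Hypothetical helper: each argument is its own list item, so a term
    # containing spaces reaches the CLI as a single value.
    return ["kaggle", "datasets", "list", "-s", search_term]


def search_kaggle(search_term):
    # No shell is involved here, so search_term needs no manual quoting.
    completed = subprocess.run(
        build_kaggle_search_command(search_term),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout.splitlines()
```

With this version, search_kaggle(searchkey) in the view should receive the whole keyword as one argument, and user input from the form is never interpreted by a shell.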