I scraped the crypto.com website to get the current prices of crypto coins as a DataFrame. The scraping worked perfectly with pandas, but the values in the 'Price' column are mixed up.
Here's the output:
Name Price 24H CHANGE
0 BBitcoinBTC 16.678,36$16.678,36 0,32% 0,32%
1 EEthereumETH $1.230,40$1.230,40 0,52% 0,52%
2 UTetherUSDT $1,02$1,02-0,01% -0,01%
3 BBNBBNB $315,46$315,46-0,64% -0,64%
4 UUSD CoinUSDC $1,00$1,00 0,00% 0,00%
5 BBinance USDBUSD $1,00$1,00 0,00% 0,00%
6 XXRPXRP $0,4067$0,4067-0,13% -0,13%
7 DDogecoinDOGE $0,1052$0,1052 13,73% 13,73%
8 ACardanoADA $0,3232$0,3232 0,98% 0,98%
9 MPolygonMATIC $0,8727$0,8727 1,20% 1,20%
10 DPolkadotDOT $5,48$5,48 0,79% 0,79%
I created a regex to filter the mixed data:
import re

pattern = re.compile(r'(\$.*)(\$)')
for value in df['Price']:
    match = pattern.search(value)
    print(match.group(1))
Output:
$16.684,53
$1.230,25
$1,02
$315,56
$1,00
$1,00
$0,4078
$0,105
$0,3236
$0,8733
But I couldn't find a way to change the values in the DataFrame itself. What is the best way to do it? Thanks.
CodePudding user response:
If your regex is correct, this should work:
df['Price']= df['Price'].apply(lambda x: pattern.search(x).group(1))
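One caveat: the pattern from the question needs two `$` signs, so for a row like the BTC one shown above (no leading `$`) `pattern.search(x)` returns None and `.group(1)` raises an AttributeError. Below is a minimal sketch of a vectorized alternative with `Series.str.extract`. The sample rows are copied from the question's output, and the pattern is my own assumption about what a price looks like (an optional `$` followed by digits, dots and commas), so adjust it if the site formats prices differently.

```python
import pandas as pd

# A few sample rows copied from the question's output (not the full table).
df = pd.DataFrame({
    "Name": ["BBitcoinBTC", "UTetherUSDT", "BBNBBNB"],
    "Price": [
        "16.678,36$16.678,36 0,32%",
        "$1,02$1,02-0,01%",
        "$315,46$315,46-0,64%",
    ],
})

# Optional leading "$", then capture the first run of digits, dots and commas.
# The run stops at the second "$", so only the first copy of the price is kept.
first_price = df["Price"].str.extract(r"\$?([\d.,]+)", expand=False)

# Put the "$" back so every row has the same format.
df["Price"] = "$" + first_price

print(df["Price"].tolist())
```

Rows that do not match the pattern at all end up as NaN instead of raising, which makes them easy to spot afterwards with `df['Price'].isna()`.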
CodePudding user response:
Can you try this:
df['price_v2'] = df['Price'].apply(lambda x: '$' + x.split('$')[1])
'''
0 $16.678,36 0,32%
1 $1.230,40
2 $1,02
3 $315,46
4 $1,00
5 $1,00
6 $0,4067
7 $0,1052
8 $0,3232
9 $0,8727
10 $5,48
Name: price_v2, dtype: object
'''
Also, BTC looks different from the others: it has no leading '$'. Is this a typo on your part, or is it what the site actually returns? If there are other entries that look like BTC, we can add an if/else to the code:
df['price'] = df['Price'].apply(lambda x: '$' + x.split('$')[1] if x.startswith('$') else '$' + x.split('$')[0])
'''
0 $16.678,36
1 $1.230,40
2 $1,02
3 $315,46
4 $1,00
5 $1,00
6 $0,4067
7 $0,1052
8 $0,3232
9 $0,8727
10 $5,48
'''
Detail:
string = '$1,02$1,02-0,01%'
values = string.split('$')  # --> ['', '1,02', '1,02-0,01%']
final_value = values[1]  # We only need the price, so I take the second element and apply the same logic to every row of the DataFrame.
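The same split logic can also be written as a small named function, which is easier to read and test than a long lambda. This is only a sketch: `first_price` is a name I made up, and the two test strings are taken from the question's output.

```python
def first_price(raw: str) -> str:
    """Return the first price in a string such as '$1,02$1,02-0,01%'."""
    parts = raw.split("$")
    # When the string starts with "$", parts[0] is empty and the price is parts[1].
    # Otherwise (like the BTC row) the price is everything before the first "$".
    price = parts[1] if raw.startswith("$") else parts[0]
    return "$" + price


print(first_price("$1,02$1,02-0,01%"))           # normal row
print(first_price("16.678,36$16.678,36 0,32%"))  # BTC-style row, no leading "$"
```

With the DataFrame from the question it would be applied as `df['price'] = df['Price'].apply(first_price)`.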