How can the money
type that PostgreSQL offers be robustly parsed, to extract both the value and the currency symbol? (In Python, but something a bit non-language specific is also welcome)
The problems I think are that various components can change. e.g currency symbol can vary, as its position, as well as the symbol for what I would call decimal point, and maybe even the negative symbol...
Context: I'm writing a PostgreSQL adapter for Python, and wondering whether to parse money
output, or leave it as a string.
Here is a list of all(?) 148 possible ways 12345.67 and -12345.67 can be format on my system based on the lc_monetary
setting. (There are more lc_monetary
possibilities but I've removed values that would duplicate output in this list)
lc_monetary | 12345.67 | -12345.67 |
---|---|---|
aa_DJ.iso88591 | $12 345.67 | -$12 345.67 |
aa_ER.utf8 | $ 12,346 | -$ 12,346 |
aa_ET.utf8 | $12,345.67 | -$12,345.67 |
af_ZA.iso88591 | R12,345.67 | -R12,345.67 |
an_ES.iso885915 | 12.345,67 € | -12.345,67 € |
ar_AE.iso88596 | د.إ. 12,345.670 | د.إ. 12,345.670- |
ar_BH.iso88596 | د.ب. 12,345.670 | د.ب. 12,345.670- |
ar_DZ.iso88596 | د.ج. 12,345.670 | د.ج. 12,345.670- |
ar_EG.iso88596 | ج.م. 12,345.670 | ج.م. 12,345.670- |
ar_IN.utf8 | ₹ 12,345.67 | -₹ 12,345.67 |
ar_IQ.iso88596 | د.ع. 12,345.670 | د.ع. 12,345.670- |
ar_JO.iso88596 | د.أ. 12,345.670 | د.أ. 12,345.670- |
ar_KW.iso88596 | د.ك. 12,345.670 | د.ك. 12,345.670- |
ar_LB.iso88596 | ل.ل. 12,345.670 | ل.ل. 12,345.670- |
ar_LY.iso88596 | د.ل. 12,345.670 | د.ل. 12,345.670- |
ar_MA.iso88596 | د.م. 12,345.670 | د.م. 12,345.670- |
ar_OM.iso88596 | ر.ع. 12,345.670 | ر.ع. 12,345.670- |
ar_QA.iso88596 | ر.ق. 12,345.670 | ر.ق. 12,345.670- |
ar_SA.iso88596 | 12,345.67 ريال | -12,345.67 ريال |
ar_SD.iso88596 | ج.س. 12,345.670 | ج.س. 12,345.670- |
ar_SY.iso88596 | ل.س. 12,345.670 | ل.س. 12,345.670- |
ar_TN.iso88596 | د.ت. 12,345.670 | د.ت. 12,345.670- |
ar_YE.iso88596 | ر.ي. 12,345.670 | ر.ي. 12,345.670- |
ayc_PE.utf8 | S/ 12,345.67 | -S/ 12,345.67 |
az_AZ.utf8 | 12 345.67 man. | -12 345.67 man. |
be_BY.cp1251 | 12 345.67 руб | -12 345.67 руб |
be_BY.utf8@latin | 12 345.67 Rub | -12 345.67 Rub |
bem_ZM.utf8 | K12,345.67 | -K12,345.67 |
ber_MA.utf8 | ⴷ.ⵎ. 12,345.670 | ⴷ.ⵎ. 12,345.670- |
bg_BG.cp1251 | 12 345,67 лв | -12 345,67 лв |
bn_BD.utf8 | ৳ 12,345.67 | -৳ 12,345.67 |
bn_IN.utf8 | ₹ 1,23,45.67 | -₹ 1,23,45.67 |
bo_CN.utf8 | ¥12,345.67 | ¥-12,345.67 |
br_FR.iso88591 | 12 345,67 EUR | -12 345,67 EUR |
br_FR.iso885915@euro | 12 345,67 € | -12 345,67 € |
bs_BA.iso88592 | KM 12 345,67 | -KM 12 345,67 |
ca_AD.iso885915 | € 12.345,67 | -€ 12.345,67 |
ca_ES.iso88591 | EUR 12.345,67 | -EUR 12.345,67 |
crh_UA.utf8 | 12 345.67 gr | -12 345.67 gr |
cs_CZ.iso88592 | 12 345,67 Kč | -12 345,67 Kč |
csb_PL.utf8 | 12.345,67 zł | -12.345,67 zł |
cv_RU.utf8 | 12 345.67 t | -12 345.67 t |
cy_GB.iso885914 | £12,345.67 | -£12,345.67 |
da_DK.iso88591 | kr 12.345,67 | kr -12.345,67 |
de_AT.iso88591 | EUR 12 345,67 | -EUR 12 345,67 |
de_AT.iso885915@euro | € 12 345,67 | -€ 12 345,67 |
de_BE.iso88591 | EUR 12.345,67 | EUR- 12.345,67 |
de_BE.iso885915@euro | € 12.345,67 | €- 12.345,67 |
de_CH.iso88591 | Fr. 12'345.67 | Fr.- 12'345.67 |
de_DE.iso88591 | 12.345,67 EUR | -12.345,67 EUR |
dv_MV.utf8 | ރ. 12,345.67 | -ރ.12,345.67 |
dz_BT.utf8 | དངུལ་ཀྲམ་ 12,345.670 | དངུལ་ཀྲམ་- 12,345.670 |
el_CY.iso88597 | 12.345,67€ | -€12.345,67 |
en_BW.iso88591 | Pu12,345.67 | -Pu12,345.67 |
en_DK.iso88591 | ¤12.345,67 | -¤12.345,67 |
en_HK.iso88591 | HK$12,345.67 | (HK$12,345.67) |
en_IE.iso88591 | EUR12,345.67 | -EUR12,345.67 |
en_IE.iso885915@euro | €12,345.67 | -€12,345.67 |
en_NG.utf8 | ₦12,345.67 | -₦12,345.67 |
en_PH.iso88591 | Php12,345.67 | (Php12,345.67) |
en_SG.iso88591 | $12,345.67 | ($12,345.67) |
en_ZW.iso88591 | Z$12,345.67 | -Z$12,345.67 |
es_AR.iso88591 | $ 12.345,67 | -$ 12.345,67 |
es_BO.iso88591 | $b 12.345,67 | -$b 12.345,67 |
es_CR.iso88591 | C= 12 345,67 | -C= 12 345,67 |
es_CR.utf8 | ₡ 12 345,67 | -₡ 12 345,67 |
es_CU.utf8 | 12 345,67 $ | -12 345,67 $ |
es_DO.iso88591 | $ 12,345.67 | -$ 12,345.67 |
es_GT.iso88591 | Q 12,345.67 | -Q 12,345.67 |
es_HN.iso88591 | L. 12,345.67 | -L. 12,345.67 |
es_NI.iso88591 | C$ 12,345.67 | -C$ 12,345.67 |
es_PA.iso88591 | B/ 12,345.67 | -B/ 12,345.67 |
es_PY.iso88591 | Gs. 12.345,67 | -Gs. 12.345,67 |
es_SV.iso88591 | C= 12,345.67 | -C= 12,345.67 |
es_SV.utf8 | ₡ 12,345.67 | -₡ 12,345.67 |
es_VE.iso88591 | Bs. 12.345,67 | -Bs. 12.345,67 |
et_EE.iso88591 | EUR 12 345,67 | -EUR 12 345,67 |
et_EE.iso885915 | € 12 345,67 | -€ 12 345,67 |
eu_ES.iso885915@euro | € 12.346 | -€ 12.346 |
fa_IR.utf8 | 12٬346 ریال | -12٬346 ریال |
ff_SN.utf8 | 12,345.67 CFA | -12,345.67 CFA |
fi_FI.iso88591 | 12 345,67 EUR | -12 345,67 EUR |
fi_FI.iso885915@euro | 12 345,67 € | -12 345,67 € |
fil_PH.utf8 | PhP12,345.67 | -PhP 12,345.67 |
fr_CA.iso88591 | 12 345,67 $ | (12 345,67 $) |
fy_NL.utf8 | € 12 345,67 | € 12 345,67- |
gu_IN.utf8 | ₹ 12,345.67 | -₹ 12,345.67 |
he_IL.iso88598 | שח 12,345.67 | שח 12,345.67- |
hr_HR.iso88592 | Kn 12 345,67 | -Kn 12 345,67 |
ht_HT.utf8 | 12 345,67 g | -12 345,67 g |
hu_HU.iso88592 | 12.345,67 Ft | -12.345,67 Ft |
hy_AM.utf8 | Դ12,345.67 | -Դ12,345.67 |
id_ID.iso88591 | Rp12.345,67 | -Rp12.345,67 |
is_IS.iso88591 | 12.346 kr | -12.346 kr |
ja_JP.eucjp | ¥12,346 | ¥-12,346 |
ka_GE.utf8 | ლ12.345,67 | -ლ12.345,67 |
kk_KZ.utf8 | 12 345.67 тг | -12 345.67 тг |
km_KH.utf8 | 12,345.67៛ | -12,345.67៛ |
ko_KR.euckr | ₩12,346 | ₩-12,346 |
ku_TR.iso88599 | 12.345,67 TL | -12.345,67 TL |
ky_KG.utf8 | 12 345.67 сом | -12 345.67 сом |
lg_UG.iso885910 | 12,345.67/- | -12,345.67/- |
lo_LA.utf8 | ₭ 12,345.67 | ₭ -12,345.67 |
lt_LT.iso885913 | 12.345,67 Lt | -12.345,67 Lt |
lv_LV.iso885913 | Ls 12 345,67 | -Ls 12 345,67 |
mg_MG.iso885915 | 12 345,67 AR | -12 345,67 AR |
mhr_RU.utf8 | 12 345.67 ТЕҤ | -12 345.67 ТЕҤ |
mk_MK.iso88595 | 12 345,67 ден | -12 345,67 ден |
mn_MN.utf8 | 12 345.67 ₮ | -12 345.67 ₮ |
ms_MY.iso88591 | RM12,345.67 | (RM12,345.67) |
mt_MT.iso88593 | 12,345.67EUR | (12,345.67EUR) |
mt_MT.utf8 | 12,345.67€ | (12,345.67€) |
my_MM.utf8 | 12,345.67Ks | -12,345.67Ks |
nan_TW.utf8@latin | NT$12,345.67 | -NT$12,345.67 |
nb_NO.iso88591 | kr12 345,67 | kr-12 345,67 |
ne_NP.utf8 | रू 12,345.67 | -रू 12,345.67 |
nl_AW.utf8 | Afl. 12 345,67 | Afl. 12 345,67- |
nl_BE.iso88591 | EUR 12 345,67 | EUR 12 345,67- |
nn_NO.iso88591 | kr 12 345,67 | -kr12 345,67 |
om_KE.iso88591 | Ksh12,345.67 | -Ksh12,345.67 |
os_RU.utf8 | 12 345.67 сом | -12 345.67 сом |
pa_PK.utf8 | Rs 12,345.67 | -Rs12,345.67 |
pap_AN.utf8 | f 12 345,67 | f 12 345,67- |
ps_AF.utf8 | 12٬346 افغانۍ | -12٬346 افغانۍ |
pt_BR.iso88591 | R$ 12.345,67 | -R$ 12.345,67 |
ro_RO.iso88592 | Lei 12.345,67 | -Lei 12.345,67 |
ru_RU.iso88595 | 12 345.67 руб | -12 345.67 руб |
ru_UA.koi8u | 12 345.67 гр | -12 345.67 гр |
rw_RW.utf8 | 12.345,67 Frw | -12.345,67 Frw |
sd_IN.utf8@devanagari | रु 12,345.67 | -रु 12,345.67 |
se_NO.utf8 | ru12.345,67 | ru-12.345,67 |
si_LK.utf8 | ₨ 12,345.67 | -₨ 12,345.67 |
sq_AL.iso88591 | Lek12.345,670 | -Lek12.345,670 |
sq_MK.utf8 | 12 345,67 den | -12 345,67 den |
sr_RS.utf8 | 12.345,67 дин | -12.345,67 дин |
sr_RS.utf8@latin | din 12.346 | -din 12.346 |
sv_SE.iso88591 | 12 345,67 kr | -12 345,67 kr |
sw_TZ.utf8 | TSh12,345.67 | -TSh12,345.67 |
te_IN.utf8 | ₹12,345.67 | -₹12,345.67 |
th_TH.utf8 | ฿ 12,345.67 | ฿ -12,345.67 |
tk_TM.utf8 | 12,345.67 MANAT | -12,345.67 MANAT |
tt_RU.utf8@iqtelif | 12 345.67 sum | -12 345.67 sum |
uk_UA.koi8u | 12 345,67грн. | -12 345,67 грн. |
uz_UZ.iso88591 | so'm12,345.67 | -so'm12,345.67 |
uz_UZ.utf8@cyrillic | сўм12,345.67 | -сўм12,345.67 |
vi_VN.utf8 | 12.346₫ | -₫12.346 |
wo_SN.utf8 | 12 345,67 CFA | -12 345,67 CFA |
yi_US.cp1255 | $ 12,345.67 | $ 12,345.67- |
CodePudding user response:
Based on people's comments, my answer is:
Don't
Either don't use the type, or at most use it as output-only, to not be parsed.