Home > Back-end >  How to robustly parse PostgreSQL money (in Python)
How to robustly parse PostgreSQL money (in Python)

Time:04-23

How can the money type that PostgreSQL offers be robustly parsed, to extract both the value and the currency symbol? (In Python, but something a bit non-language specific is also welcome)

The problems I think are that various components can change. e.g currency symbol can vary, as its position, as well as the symbol for what I would call decimal point, and maybe even the negative symbol...

Context: I'm writing a PostgreSQL adapter for Python, and wondering whether to parse money output, or leave it as a string.

Here is a list of all(?) 148 possible ways 12345.67 and -12345.67 can be format on my system based on the lc_monetary setting. (There are more lc_monetary possibilities but I've removed values that would duplicate output in this list)

lc_monetary 12345.67 -12345.67
aa_DJ.iso88591 $12 345.67 -$12 345.67
aa_ER.utf8 $ 12,346 -$ 12,346
aa_ET.utf8 $12,345.67 -$12,345.67
af_ZA.iso88591 R12,345.67 -R12,345.67
an_ES.iso885915 12.345,67 € -12.345,67 €
ar_AE.iso88596 د.إ. 12,345.670 د.إ. 12,345.670-
ar_BH.iso88596 د.ب. 12,345.670 د.ب. 12,345.670-
ar_DZ.iso88596 د.ج. 12,345.670 د.ج. 12,345.670-
ar_EG.iso88596 ج.م. 12,345.670 ج.م. 12,345.670-
ar_IN.utf8 ₹ 12,345.67 -₹ 12,345.67
ar_IQ.iso88596 د.ع. 12,345.670 د.ع. 12,345.670-
ar_JO.iso88596 د.أ. 12,345.670 د.أ. 12,345.670-
ar_KW.iso88596 د.ك. 12,345.670 د.ك. 12,345.670-
ar_LB.iso88596 ل.ل. 12,345.670 ل.ل. 12,345.670-
ar_LY.iso88596 د.ل. 12,345.670 د.ل. 12,345.670-
ar_MA.iso88596 د.م. 12,345.670 د.م. 12,345.670-
ar_OM.iso88596 ر.ع. 12,345.670 ر.ع. 12,345.670-
ar_QA.iso88596 ر.ق. 12,345.670 ر.ق. 12,345.670-
ar_SA.iso88596 12,345.67 ريال -12,345.67 ريال
ar_SD.iso88596 ج.س. 12,345.670 ج.س. 12,345.670-
ar_SY.iso88596 ل.س. 12,345.670 ل.س. 12,345.670-
ar_TN.iso88596 د.ت. 12,345.670 د.ت. 12,345.670-
ar_YE.iso88596 ر.ي. 12,345.670 ر.ي. 12,345.670-
ayc_PE.utf8 S/ 12,345.67 -S/ 12,345.67
az_AZ.utf8 12 345.67 man. -12 345.67 man.
be_BY.cp1251 12 345.67 руб -12 345.67 руб
be_BY.utf8@latin 12 345.67 Rub -12 345.67 Rub
bem_ZM.utf8 K12,345.67 -K12,345.67
ber_MA.utf8 ⴷ.ⵎ. 12,345.670 ⴷ.ⵎ. 12,345.670-
bg_BG.cp1251 12 345,67 лв -12 345,67 лв
bn_BD.utf8 ৳ 12,345.67 -৳ 12,345.67
bn_IN.utf8 ₹ 1,23,45.67 -₹ 1,23,45.67
bo_CN.utf8 ¥12,345.67 ¥-12,345.67
br_FR.iso88591 12 345,67 EUR -12 345,67 EUR
br_FR.iso885915@euro 12 345,67 € -12 345,67 €
bs_BA.iso88592 KM 12 345,67 -KM 12 345,67
ca_AD.iso885915 € 12.345,67 -€ 12.345,67
ca_ES.iso88591 EUR 12.345,67 -EUR 12.345,67
crh_UA.utf8 12 345.67 gr -12 345.67 gr
cs_CZ.iso88592 12 345,67 Kč -12 345,67 Kč
csb_PL.utf8 12.345,67 zł -12.345,67 zł
cv_RU.utf8 12 345.67 t -12 345.67 t
cy_GB.iso885914 £12,345.67 -£12,345.67
da_DK.iso88591 kr 12.345,67 kr -12.345,67
de_AT.iso88591 EUR 12 345,67 -EUR 12 345,67
de_AT.iso885915@euro € 12 345,67 -€ 12 345,67
de_BE.iso88591 EUR 12.345,67 EUR- 12.345,67
de_BE.iso885915@euro € 12.345,67 €- 12.345,67
de_CH.iso88591 Fr. 12'345.67 Fr.- 12'345.67
de_DE.iso88591 12.345,67 EUR -12.345,67 EUR
dv_MV.utf8 ރ. 12,345.67 -ރ.12,345.67
dz_BT.utf8 དངུལ་ཀྲམ་ 12,345.670 དངུལ་ཀྲམ་- 12,345.670
el_CY.iso88597 12.345,67€ -€12.345,67
en_BW.iso88591 Pu12,345.67 -Pu12,345.67
en_DK.iso88591 ¤12.345,67 -¤12.345,67
en_HK.iso88591 HK$12,345.67 (HK$12,345.67)
en_IE.iso88591 EUR12,345.67 -EUR12,345.67
en_IE.iso885915@euro €12,345.67 -€12,345.67
en_NG.utf8 ₦12,345.67 -₦12,345.67
en_PH.iso88591 Php12,345.67 (Php12,345.67)
en_SG.iso88591 $12,345.67 ($12,345.67)
en_ZW.iso88591 Z$12,345.67 -Z$12,345.67
es_AR.iso88591 $ 12.345,67 -$ 12.345,67
es_BO.iso88591 $b 12.345,67 -$b 12.345,67
es_CR.iso88591 C= 12 345,67 -C= 12 345,67
es_CR.utf8 ₡ 12 345,67 -₡ 12 345,67
es_CU.utf8 12 345,67 $ -12 345,67 $
es_DO.iso88591 $ 12,345.67 -$ 12,345.67
es_GT.iso88591 Q 12,345.67 -Q 12,345.67
es_HN.iso88591 L. 12,345.67 -L. 12,345.67
es_NI.iso88591 C$ 12,345.67 -C$ 12,345.67
es_PA.iso88591 B/ 12,345.67 -B/ 12,345.67
es_PY.iso88591 Gs. 12.345,67 -Gs. 12.345,67
es_SV.iso88591 C= 12,345.67 -C= 12,345.67
es_SV.utf8 ₡ 12,345.67 -₡ 12,345.67
es_VE.iso88591 Bs. 12.345,67 -Bs. 12.345,67
et_EE.iso88591 EUR 12 345,67 -EUR 12 345,67
et_EE.iso885915 € 12 345,67 -€ 12 345,67
eu_ES.iso885915@euro € 12.346 -€ 12.346
fa_IR.utf8 12٬346 ریال -12٬346 ریال
ff_SN.utf8 12,345.67 CFA -12,345.67 CFA
fi_FI.iso88591 12 345,67 EUR -12 345,67 EUR
fi_FI.iso885915@euro 12 345,67 € -12 345,67 €
fil_PH.utf8 PhP12,345.67 -PhP 12,345.67
fr_CA.iso88591 12 345,67 $ (12 345,67 $)
fy_NL.utf8 € 12 345,67 € 12 345,67-
gu_IN.utf8 ₹ 12,345.67 -₹ 12,345.67
he_IL.iso88598 שח 12,345.67 שח 12,345.67-
hr_HR.iso88592 Kn 12 345,67 -Kn 12 345,67
ht_HT.utf8 12 345,67 g -12 345,67 g
hu_HU.iso88592 12.345,67 Ft -12.345,67 Ft
hy_AM.utf8 Դ12,345.67 -Դ12,345.67
id_ID.iso88591 Rp12.345,67 -Rp12.345,67
is_IS.iso88591 12.346 kr -12.346 kr
ja_JP.eucjp ¥12,346 ¥-12,346
ka_GE.utf8 ლ12.345,67 -ლ12.345,67
kk_KZ.utf8 12 345.67 тг -12 345.67 тг
km_KH.utf8 12,345.67៛ -12,345.67៛
ko_KR.euckr ₩12,346 ₩-12,346
ku_TR.iso88599 12.345,67 TL -12.345,67 TL
ky_KG.utf8 12 345.67 сом -12 345.67 сом
lg_UG.iso885910 12,345.67/- -12,345.67/-
lo_LA.utf8 ₭ 12,345.67 ₭ -12,345.67
lt_LT.iso885913 12.345,67 Lt -12.345,67 Lt
lv_LV.iso885913 Ls 12 345,67 -Ls 12 345,67
mg_MG.iso885915 12 345,67 AR -12 345,67 AR
mhr_RU.utf8 12 345.67 ТЕҤ -12 345.67 ТЕҤ
mk_MK.iso88595 12 345,67 ден -12 345,67 ден
mn_MN.utf8 12 345.67 ₮ -12 345.67 ₮
ms_MY.iso88591 RM12,345.67 (RM12,345.67)
mt_MT.iso88593 12,345.67EUR (12,345.67EUR)
mt_MT.utf8 12,345.67€ (12,345.67€)
my_MM.utf8 12,345.67Ks -12,345.67Ks
nan_TW.utf8@latin NT$12,345.67 -NT$12,345.67
nb_NO.iso88591 kr12 345,67 kr-12 345,67
ne_NP.utf8 रू 12,345.67 -रू 12,345.67
nl_AW.utf8 Afl. 12 345,67 Afl. 12 345,67-
nl_BE.iso88591 EUR 12 345,67 EUR 12 345,67-
nn_NO.iso88591 kr 12 345,67 -kr12 345,67
om_KE.iso88591 Ksh12,345.67 -Ksh12,345.67
os_RU.utf8 12 345.67 сом -12 345.67 сом
pa_PK.utf8 Rs 12,345.67 -Rs12,345.67
pap_AN.utf8 f 12 345,67 f 12 345,67-
ps_AF.utf8 12٬346 افغانۍ -12٬346 افغانۍ
pt_BR.iso88591 R$ 12.345,67 -R$ 12.345,67
ro_RO.iso88592 Lei 12.345,67 -Lei 12.345,67
ru_RU.iso88595 12 345.67 руб -12 345.67 руб
ru_UA.koi8u 12 345.67 гр -12 345.67 гр
rw_RW.utf8 12.345,67 Frw -12.345,67 Frw
sd_IN.utf8@devanagari रु 12,345.67 -रु 12,345.67
se_NO.utf8 ru12.345,67 ru-12.345,67
si_LK.utf8 ₨ 12,345.67 -₨ 12,345.67
sq_AL.iso88591 Lek12.345,670 -Lek12.345,670
sq_MK.utf8 12 345,67 den -12 345,67 den
sr_RS.utf8 12.345,67 дин -12.345,67 дин
sr_RS.utf8@latin din 12.346 -din 12.346
sv_SE.iso88591 12 345,67 kr -12 345,67 kr
sw_TZ.utf8 TSh12,345.67 -TSh12,345.67
te_IN.utf8 ₹12,345.67 -₹12,345.67
th_TH.utf8 ฿ 12,345.67 ฿ -12,345.67
tk_TM.utf8 12,345.67 MANAT -12,345.67 MANAT
tt_RU.utf8@iqtelif 12 345.67 sum -12 345.67 sum
uk_UA.koi8u 12 345,67грн. -12 345,67 грн.
uz_UZ.iso88591 so'm12,345.67 -so'm12,345.67
uz_UZ.utf8@cyrillic сўм12,345.67 -сўм12,345.67
vi_VN.utf8 12.346₫ -₫12.346
wo_SN.utf8 12 345,67 CFA -12 345,67 CFA
yi_US.cp1255 $ 12,345.67 $ 12,345.67-

CodePudding user response:

Based on people's comments, my answer is:

Don't

Either don't use the type, or at most use it as output-only, to not be parsed.

  • Related