I am trying to find the length of Hindi words in Python. For example, 'प्रवीण' should have a length of 3, as far as I know.
w1 = 'प्रवीण'
print(len(w1))
I tried this code, but it didn't work: it prints 6 instead of 3.
CodePudding user response:
In Hindi, a character as written need not have length one in a Python string, as it does in English. For example, वी is not one character but rather two characters (code points) combined into one:
- व
- ी
So in your case, the word प्रवीण is not of length 3 but rather 6.
w1 = "प्रवीण"
for w in w1:
    print(w)
And the output would be
प
्
र
व
ी
ण
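To see what each of those six pieces is, it may help to print its Unicode name and category. This is only a small sketch using the standard unicodedata module; the word is spelled with escapes here purely so the code points are unambiguous, and the variable name categories is just for illustration:

```python
import unicodedata

# 'प्रवीण' written as explicit code points (an illustrative spelling of the same word)
w1 = "\u092a\u094d\u0930\u0935\u0940\u0923"

# Collect the general category of every code point in the word
categories = [unicodedata.category(ch) for ch in w1]

for ch in w1:
    # Show the code point, its category and its official Unicode name
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {unicodedata.name(ch)}")
```

The four consonants come out as category Lo (letter), while the virama (्) is Mn and the vowel sign (ी) is Mc, i.e. both are combining marks that len() still counts.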
CodePudding user response:
As @betelgeuse has said, Hindi does not function the way you think it does. Here's some working code that does what you expect, though:
w1 = 'प्रवीण'

def hindi_len(word):
    hindi_letts = 'कखगघङचछजझञटठडढणतथदधनपफबभमक़ख़ग़ज़ड़ढ़फ़यरलळवहशषसऱऴअआइईउऊऋॠऌॡएऐओऔॐऍऑऎऒ'
    # List of Hindi letters that aren't halves or matras
    count = 0
    for idx, ch in enumerate(word):
        # Skip a letter that directly follows a virama (्):
        # it is joined to the preceding half-letter
        if ch in hindi_letts and (idx == 0 or word[idx - 1] != '्'):
            count += 1
    return count

print(hindi_len(w1))
This outputs 3. It's up to you to customize it as you'd like, though.
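If you'd rather not maintain the letter list by hand, one possible variation is to lean on Unicode categories instead. This is only a sketch under the assumption that "length" means letters that don't directly follow a virama; the names count_aksharas and VIRAMA are made up for this example, and it does not handle ZWJ/ZWNJ or other edge cases:

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA (्)

def count_aksharas(word):
    """Count letters (category Lo) that do not directly follow a virama."""
    count = 0
    prev = ""
    for ch in word:
        # Matras and the virama are combining marks (Mn/Mc), so they are skipped;
        # a letter right after a virama belongs to the same conjunct.
        if unicodedata.category(ch) == "Lo" and prev != VIRAMA:
            count += 1
        prev = ch
    return count

# 'प्रवीण' spelled with escapes so the code points are unambiguous
print(count_aksharas("\u092a\u094d\u0930\u0935\u0940\u0923"))
```

For this word it gives the same result as hindi_len above.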
Edit: Make sure you use Python 3.x, or prefix Hindi strings with u in Python 2.x. I've seen errors caused by non-Unicode string encoding in Python 2.x before.