I am trying to sift through some free text answers on geographical locations. As one of the steps, I want to check if the answer is any of the 290 municipalities in my country. As 290 entries would make my code cumbersome/hard to read I try saving them in an array, like below:
Data resorTEST;
keep R_res_ort_namn R_res_ort_txt R_kom_lan R_kommun;
set resor1TEST resor2TEST resor3TEST;
R_res_ort_namn=strip(lowcase(R_res_ort_namn));
R_res_ort_txt=strip(lowcase(R_res_ort_txt));
R_kom_lan=strip(lowcase(R_kom_lan));
array kommuner{290} $ ("upplands väsby" "vallentuna" "österåker" "värmdö"
"järfälla" "ekerö" "huddinge" "botkyrka" "salem"
"haninge" "tyresö" "upplands-bro" "nykvarn" "täby"
"danderyd" "sollentuna" "stockholm" "södertälje" "nacka" "sundbyberg"
"solna" "lidingö" "vaxholm" "norrtälje" "sigtuna" "nynäshamn"
"håbo" "älvkarleby" "knivsta" "heby" "tierp" "uppsala"
"enköping" "östhammar" "vingåker" "gnesta" "nyköping" "oxelösund" "flen"
"katrineholm" "eskilstuna" "strängnäs" "trosa" "ödeshög" "ydre"
"kinda" "boxholm" "åtvidaberg" "finspång" "valdemarsvik" "linköping"
"norrköping" "söderköping" "motala" "vadstena" "mjölby" "aneby"
"gnosjö" "mullsjö" "habo" "gislaved" "vaggeryd" "jönköping"
"nässjö" "värnamo" "sävsjö" "vetlanda" "eksjö" "tranås"
"uppvidinge" "lessebo" "tingsryd" "alvesta" "älmhult" "markaryd"
"växjö" "ljungby" "högsby" "torsås" "mörbylånga" "hultsfred"
"mönsterås" "emmaboda" "kalmar" "nybro" "oskarshamn" "västervik"
"vimmerby" "borgholm" "gotland" "olofström" "karlskrona" "ronneby"
"karlshamn" "sölvesborg" "svalöv" "staffanstorp" "burlöv" "vellinge"
"östra göinge" "örkelljunga" "bjuv" "kävlinge" "lomma" "svedala"
"skurup" "sjöbo" "hörby" "höör" "tomelilla" "bromölla"
"osby" "perstorp" "klippan" "åstorp" "båstad" "malmö"
"lund" "landskrona" "helsingborg" "höganäs" "eslöv" "ystad"
"trelleborg" "kristianstad" "simrishamn" "ängelholm" "hässleholm" "hylte"
"halmstad" "laholm" "falkenberg" "varberg" "kungsbacka" "härryda"
"partille" "öckerö" "stenungsund" "tjörn" "orust" "sotenäs"
"munkedal" "tanum" "dals-ed" "färgelanda" "ale" "lerum"
"vårgårda" "bollebygd" "grästorp" "essunga" "karlsborg" "gullspång"
"tranemo" "bengtsfors" "mellerud" "lilla edet" "mark" "svenljunga"
"herrljunga" "vara" "götene" "tibro" "töreboda" "göteborg"
"mölndal" "kungälv" "lysekil" "uddevalla" "strömstad" "vänersborg"
"trollhättan" "alingsås" "borås" "ulricehamn" "åmål" "mariestad"
"lidköping" "skara" "skövde" "hjo" "tidaholm" "falköping"
"kil" "eda" "torsby" "storfors" "hammarö" "munkfors"
"forshaga" "grums" "årjäng" "sunne" "karlstad" "kristinehamn"
"filipstad" "hagfors" "arvika" "säffle" "lekeberg" "laxå"
"hallsberg" "degerfors" "hällefors" "ljusnarsberg" "örebro" "kumla"
"askersund" "karlskoga" "nora" "lindesberg" "skinnskatteberg" "surahammar"
"kungsör" "hallstahammar" "norberg" "västerås" "sala" "fagersta"
"köping" "arboga" "vansbro" "malung-sälen" "gagnef" "leksand"
"rättvik" "orsa" "älvdalen" "smedjebacken" "mora" "falun"
"borlänge" "säter" "hedemora" "avesta" "ludvika" "ockelbo"
"hofors" "ovanåker" "nordanstig" "ljusdal" "gävle" "sandviken"
"söderhamn" "bollnäs" "hudiksvall" "ånge" "timrå" "härnösand"
"sundsvall" "kramfors" "sollefteå" "örnsköldsvik" "ragunda" "bräcke"
"krokom" "strömsund" "åre" "berg" "härjedalen" "östersund"
"nordmaling" "bjurholm" "vindeln" "robertsfors" "norsjö" "malå"
"storuman" "sorsele" "dorotea" "vännäs" "vilhelmina" "åsele"
"umeå" "lycksele" "skellefteå" "arvidsjaur" "arjeplog" "jokkmokk"
"överkalix" "kalix" "övertorneå" "pajala" "gällivare" "älvsbyn"
"luleå" "piteå" "boden" "haparanda" "kiruna");
/*if not missing(R_res_ort_namn) then R_kommun=prxchange("s/^.*-(.* kommun)/$1/",1,R_res_ort_namn);
else if prxmatch("/^.*([a-zA-Z]*? kommun).*$/",R_res_ort_txt) then R_kommun=prxchange("s/^.*?([a-zA-Z]*? kommun).*$/$1/",-1,R_res_ort_txt);
else if prxmatch("/^.*([a-zA-Z]*? kommun).*$/",R_kom_lan) then R_kommun=prxchange("s/^.*?([a-zA-Z]*? kommun).*$/$1/",-1,R_kom_lan);
else */if R_res_ort_txt in kommuner then R_kommun=R_res_ort_txt;
run;
However, for some reason this does not seem to work for all of the municipalities. The municipality of "uppsala" works for instance, but not the municipality of "ängelholm".
I have tried stripping the variables of whitespace and converting everything to lowercase. What am I doing wrong?
Additional info:
For some reason it does work flawlessly if I skip the array and just copy-paste the exact same list of municipality names into a parenthesis following the in-operator. I would however need to repeat this step 5-6 times and this solution would make my code quite cumbersome.
CodePudding user response:
You are defining the ARRAY
Array kommuner{290} $
with character length 8. See what happens when you fix that.