Matching the name of the companies to their index in Python

Time:09-17

I'm trying to match each company's name to its index (the stock code shown in parentheses). Here's what I have:

company_list
Out[31]: 
0                   BTECH (0011)ACEBRITE-TECH BERHAD
1                    EDEN (7471)MAINEDEN INC. BERHAD
2              GASMSIA (5209)MAINGAS MALAYSIA BERHAD
3      MALAKOF (5264)MAINMALAKOFF CORPORATION BERHAD
4       MFCB (3069)MAINMEGA FIRST CORPORATION BERHAD
5                     PBA (5041)MAINPBA HOLDINGS BHD
6               PETGAS (6033)MAINPETRONAS GAS BERHAD
7         RANHILL (5272)MAINRANHILL UTILITIES BERHAD
8                     SALCON (8567)MAINSALCON BERHAD
9     TALIWRK (8524)MAINTALIWORKS CORPORATION BERHAD
10              TENAGA (5347)MAINTENAGA NASIONAL BHD
11              YTL (4677)MAINYTL CORPORATION BERHAD
12     YTLPOWR (6742)MAINYTL POWER INTERNATIONAL BHD
Name: Company, dtype: object

When I retrieve the historical stock prices using yfinance, the columns come back sorted by the companies' index in ascending order:

import re
import yfinance as yf

def sto_data(start, end, adj_close_fname):
    sto_code = []
    for i in range(len(company_list)):
        temp = re.findall(r'\d+', company_list[i])  # digits of the stock code
        res = "".join(temp)
        sto_code.append(f'{res}.KL')
    stocks_data = yf.download(sto_code,start=start, end=end)
    stocks_adj_close = stocks_data['Adj Close']
    stocks_adj_close.to_csv(adj_close_fname)
    return stocks_adj_close

Here is what is saved to the CSV file:

Date,0011.KL,3069.KL,4677.KL,5041.KL,5209.KL,5264.KL,5272.KL,5347.KL,6033.KL,6742.KL,7471.KL,8524.KL,8567.KL
2020-01-02,0.21262522041797638,2.504232406616211,0.9967740178108215,1.0405075550079346,2.516713857650757,0.7837547063827515,0.9970082640647888,12.046622276306152,15.702932357788086,0.7417479157447815,0.2150000035762787,0.7533007264137268,0.23168399930000305
2020-01-03,0.21735022962093353,2.5091044902801514,0.9774190187454224,1.0405075550079346,2.562472105026245,0.7792503833770752,0.959027886390686,12.010391235351562,15.888110160827637,0.769220769405365,0.20499999821186066,0.7575327157974243,0.22695599496364594
2020-01-06,0.21735022962093353,2.504232406616211,0.9435480237007141,1.050053596496582,2.580775499343872,0.7792503833770752,0.9400370717048645,11.865468978881836,15.628861427307129,0.7646419405937195,0.20999999344348907,0.7575327157974243,0.2127709984779358

As the results above show, company_list and the CSV file do not list the indexes in the same order. How can I match the companies to their index as saved in the CSV file, so that I get this output:

0011.KL - ACEBRITE-TECH BERHAD
3069.KL - MAINMEGA FIRST CORPORATION BERHAD
...

CodePudding user response:

You can restore the original order by selecting the columns with the list of stock codes:

Code:

from io import StringIO
import pandas as pd
import re
import yfinance as yf

def sto_data(start, end, company_list, adj_close_fname = None):
    stock_codes = []
    company_names = []
    for company_str in company_list:
        res = re.findall(r'\d+', company_str)[0]  # stock code digits
        stock_code = f"{res}.KL"
        stock_codes.append(stock_code)
        company_name = re.findall(r"\d+\)(.*)", company_str)[0]  # text after "(code)"
        company_names.append(company_name)
        
    stocks_data = yf.download(stock_codes,start=start, end=end, show_errors=False)
    stocks_adj_close = stocks_data["Adj Close"]
    
    stocks_adj_close = stocks_adj_close[stock_codes]
    if adj_close_fname is not None:
        stocks_adj_close.to_csv(adj_close_fname)
    return stocks_adj_close

if __name__ == '__main__':
    compan_list_data = """
index    name
0                   BTECH (0011)ACEBRITE-TECH BERHAD
1                    EDEN (7471)MAINEDEN INC. BERHAD
2              GASMSIA (5209)MAINGAS MALAYSIA BERHAD
3      MALAKOF (5264)MAINMALAKOFF CORPORATION BERHAD
4       MFCB (3069)MAINMEGA FIRST CORPORATION BERHAD
5                     PBA (5041)MAINPBA HOLDINGS BHD
6               PETGAS (6033)MAINPETRONAS GAS BERHAD
7         RANHILL (5272)MAINRANHILL UTILITIES BERHAD
8                     SALCON (8567)MAINSALCON BERHAD
9     TALIWRK (8524)MAINTALIWORKS CORPORATION BERHAD
10              TENAGA (5347)MAINTENAGA NASIONAL BHD
11              YTL (4677)MAINYTL CORPORATION BERHAD
12     YTLPOWR (6742)MAINYTL POWER INTERNATIONAL BHD"""
    df_company_list = pd.read_csv(StringIO(compan_list_data), sep=r"\s{2,}", engine="python")
    company_list = df_company_list["name"].values
    df = sto_data("2021-09-14", "2021-09-15", company_list)
    print(df)

Result:

            0011.KL  7471.KL  5209.KL  5264.KL  3069.KL  5041.KL    6033.KL  5272.KL  8567.KL  8524.KL  5347.KL  4677.KL  6742.KL
Date                                                                                                                             
2021-09-13    0.440    0.145     2.72     0.87     3.50     0.84  16.700001    0.705    0.225    0.830    10.28    0.655    0.715
2021-09-14    0.435    0.145     2.72     0.87     3.52     0.83  16.760000    0.705    0.220    0.835    10.18    0.650    0.715
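
The function above collects company_names but never returns them, so the code-to-name pairing asked for in the question still has to be built. Here is a minimal sketch of one way to do it, assuming every entry follows the "(digits)NAME" layout shown in the question. build_code_to_name is a hypothetical helper written for this illustration, not part of yfinance or pandas, and only three of the companies are used to keep it short:

```python
import re

# A few entries copied from company_list in the question.
company_list = [
    "BTECH (0011)ACEBRITE-TECH BERHAD",
    "EDEN (7471)MAINEDEN INC. BERHAD",
    "MFCB (3069)MAINMEGA FIRST CORPORATION BERHAD",
]

# Capture the digits inside the parentheses and everything after them.
PATTERN = re.compile(r"\((\d+)\)(.*)")


def build_code_to_name(companies):
    """Map 'NNNN.KL' to the text following '(NNNN)' (hypothetical helper)."""
    mapping = {}
    for text in companies:
        match = PATTERN.search(text)
        if match is None:
            continue  # skip entries without a "(digits)" code
        code, name = match.groups()
        mapping[f"{code}.KL"] = name.strip()
    return mapping


code_to_name = build_code_to_name(company_list)

# Sorting the keys reproduces the ascending order used for the CSV columns.
for code in sorted(code_to_name):
    print(f"{code} - {code_to_name[code]}")
```

Because the lookup is keyed by stock code, it works with either column order: the sorted order in the CSV or the original order of company_list.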

CodePudding user response:

At a quick glance, it looks like your order may not be what you're expecting because you are appending to the new list sto_code. Also, are you sure you want to return stocks_adj_close?
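
One way to check where the order changes is to compare the codes as they are appended with the header of the saved CSV. The sketch below uses only the data posted in the question and makes no network calls; the variable names appended and saved are made up for this illustration. It suggests that appending keeps the company_list order and that the saved columns are simply the same codes sorted in ascending order:

```python
import csv
import io
import re

# Entries copied from company_list in the question.
company_list = [
    "BTECH (0011)ACEBRITE-TECH BERHAD",
    "EDEN (7471)MAINEDEN INC. BERHAD",
    "GASMSIA (5209)MAINGAS MALAYSIA BERHAD",
    "MALAKOF (5264)MAINMALAKOFF CORPORATION BERHAD",
    "MFCB (3069)MAINMEGA FIRST CORPORATION BERHAD",
    "PBA (5041)MAINPBA HOLDINGS BHD",
    "PETGAS (6033)MAINPETRONAS GAS BERHAD",
    "RANHILL (5272)MAINRANHILL UTILITIES BERHAD",
    "SALCON (8567)MAINSALCON BERHAD",
    "TALIWRK (8524)MAINTALIWORKS CORPORATION BERHAD",
    "TENAGA (5347)MAINTENAGA NASIONAL BHD",
    "YTL (4677)MAINYTL CORPORATION BERHAD",
    "YTLPOWR (6742)MAINYTL POWER INTERNATIONAL BHD",
]

# Header row copied from the CSV shown in the question.
csv_text = (
    "Date,0011.KL,3069.KL,4677.KL,5041.KL,5209.KL,5264.KL,5272.KL,"
    "5347.KL,6033.KL,6742.KL,7471.KL,8524.KL,8567.KL\n"
)

# Codes in the order they would be appended (same order as company_list).
appended = [re.search(r"\((\d+)\)", s).group(1) + ".KL" for s in company_list]

# Column order as saved in the file, without the leading "Date" column.
saved = next(csv.reader(io.StringIO(csv_text)))[1:]

print(appended[:3])               # ['0011.KL', '7471.KL', '5209.KL']
print(saved == sorted(appended))  # True
```

If that holds for your data too, the list built by append is fine, and the reordering happens in the downloaded table rather than in sto_code.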
