I am learning classes in python, and I have two methods of webscraping a website -namely, the functions gets the urls to paginate through. One way is written via a class Method and the other is just a straight up function. I am confused, they are both working and do the same output, but I am confused which way is more pythonic and efficient?
Using Class
class Get_URL:
def __init__(self,city,price_max,price_min, bedrm_min, bath_min):
if price_max and price_min != None and price_max <= price_min:
raise ValueError
self.url = f'&for_sale=1&quicksearch={city}&listing_price_max={price_max}&listing_price_min={price_min}&bedroom_min={bedrm_min}&full_bath_min={bath_min}&property_class_id=1,2,6,4'
while price_max == None:
self.url = self.url.replace(f'&listing_price_max={price_max}', '')
break
while price_min == None:
self.url = self.url.replace(f'&listing_price_min={price_min}', '')
break
while bedrm_min == None:
self.url = self.url.replace(f'&bedroom_min={bedrm_min}', '')
break
while bath_min == None:
self.url = self.url.replace(f'&full_bath_min={bath_min}', '')
break
def get_urls(self):
self.url_base = 'https://www.har.com/search/dosearch?page='
self.url_lst = []
for number in range(1,21):
new_url = f'{self.url_base}{number}{self.url}'
self.url_lst.append(new_url)
Output:
query1 = Get_URL('Houston', 100000,50000,None, None)
query1.get_urls()
query1.url_lst
['https://www.har.com/search/dosearch?page=1&for_sale=1&quicksearch=Houston&listing_price_max=100000&listing_price_min=50000&property_class_id=1,2,6,4',
'https://www.har.com/search/dosearch?page=2&for_sale=1&quicksearch=Houston&listing_price_max=100000&listing_price_min=50000&property_class_id=1,2,6,4',
'https://www.har.com/search/dosearch?page=3&for_sale=1&quicksearch=Houston&listing_price_max=100000&listing_price_min=50000&property_class_id=1,2,6,4',
:
:
:
]
Using User Defined Function
def get_houses(city, price_max, price_min,bedrm_min, bath_min):
# raise error if price max less than price min
if price_max and price_min != None and price_max <= price_min:
raise ValueError
# define url
page = 1
url = f'&for_sale=1&quicksearch={city}\
&listing_price_max={price_max}\
&listing_price_min={price_min}\
&bedroom_min={bedrm_min}\
&full_bath_min={bath_min}\
&property_class_id=1,2,6,4'
while price_max == None:
url = url.replace(f'&listing_price_max={price_max}', '')
break
while price_min == None:
url = url.replace(f'&listing_price_min={price_min}', '')
break
while bedrm_min == None:
url = url.replace(f'&bedroom_min={bedrm_min}', '')
break
while bath_min == None:
url = url.replace(f'&full_bath_min={bath_min}', '')
break
# Get URL List
url_lst = []
for number in range(1,21):
url_base = f'https://www.har.com/search/dosearch?page={number}'
url_lst.append(url_base url)
return(url_lst)
Outputs:
get_houses('Houston', 100000,50000,None, None)
['https://www.har.com/search/dosearch?page=1&for_sale=1&quicksearch=Houston&listing_price_max=100000&listing_price_min=50000&property_class_id=1,2,6,4',
'https://www.har.com/search/dosearch?page=2&for_sale=1&quicksearch=Houston&listing_price_max=100000&listing_price_min=50000&property_class_id=1,2,6,4',
'https://www.har.com/search/dosearch?page=3&for_sale=1&quicksearch=Houston&listing_price_max=100000&listing_price_min=50000&property_class_id=1,2,6,4',
:
:
:
]
CodePudding user response:
For your current project, functional programming works well and it is easy to develop. But Object Oriented Programming (using classes) have many advantages, especially for a larger project or a project that will be developed or re-used in the future. Here is some pros [1]:
- Modularity for easier troubleshooting
- Reuse of code through inheritance
- Flexibility
1: 4 Advantages of Object-Oriented Programming
CodePudding user response:
Im not an authority on the subject, but the difference, in my opinion, is in the use case. Use what you need.
If you are creating something small and simple, why bother with OOP and classes?
On the other hand, if you are thinking of starting a project that you know will be big and complex, or maybe you just want to give yourselve this oportunity to easily add and extend the code, then classes might be a way to go.
Having said that, I would look into OOP and its principles and see exactly what that is about. It can be complex for a beginer, however, using right tools for the job is usualy a way to go.
Hope this helps, if you have any more questions just ask.