A Python script that scrapes the largest real estate website in Hungary. It extracts public information from the website's source code.
Two languages are supported:
- English
- Hungarian
Getting available real estate types on the website:
from real_estate_hungary import RealEstateHungarySettings, RealEstateHungaryPageListings
language = 'eng'
eng_settings = RealEstateHungarySettings(lang=language)
Listing types:
eng_settings.listing_types
['for-sale', 'for-rent']
Property types:
eng_settings.property_types
['apartment',
'house',
'land',
'garage',
'summer-resort',
'industrial',
'office',
'catering-unit',
'pension']
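Because the page parameters further down are passed as plain strings, it can help to check them against these lists before making a request. The following is a minimal sketch, not part of the package: `check_choice` is a hypothetical helper, and the two lists are copied from the output above rather than read from `eng_settings`, so the live site may offer different values.

```python
# Values copied from the output shown above (assumption: they may change on the live site).
LISTING_TYPES = ['for-sale', 'for-rent']
PROPERTY_TYPES = ['apartment', 'house', 'land', 'garage', 'summer-resort',
                  'industrial', 'office', 'catering-unit', 'pension']


def check_choice(value, allowed, name):
    """Hypothetical helper: return `value` if it is in `allowed`, else raise ValueError."""
    if value not in allowed:
        raise ValueError(f"Unknown {name} {value!r}; expected one of {allowed}")
    return value


listing_type = check_choice('for-sale', LISTING_TYPES, 'listing_type')
property_type = check_choice('apartment', PROPERTY_TYPES, 'property_type')
```

In real use the lists would presumably come from `eng_settings.listing_types` and `eng_settings.property_types` instead of being hard-coded.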
Setting up page parameters:
capital_of_hungary = 'budapest'
page_params = {'real_estate_hun_settings': eng_settings,
               'city': capital_of_hungary,
               'listing_type': 'for-sale',
               'property_type': 'apartment',
               'page_num': 1}
real_estates_on_page = RealEstateHungaryPageListings(**page_params)
print('Maximum number of pages: {:,}'.format(real_estates_on_page.max_page))
print('Maximum number of {property_type}s {listing_type} in {city}: {listings:,}'.format(**{**real_estates_on_page.params, 'listings':real_estates_on_page.max_listing}))
Maximum number of pages: 1,116
Maximum number of apartments for-sale in budapest: 13,382
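Since each `RealEstateHungaryPageListings` object covers a single `page_num`, walking through several pages means building one parameter set per page. The sketch below shows one way to do that; `page_param_sets` is a hypothetical helper that is not part of the package, and the base dictionary here omits the settings object so that the snippet runs on its own.

```python
def page_param_sets(base_params, max_page, first_page=1):
    """Hypothetical helper: yield a copy of `base_params` for each page number.

    The original dictionary is left untouched; only 'page_num' differs
    between the yielded copies.
    """
    for page_num in range(first_page, max_page + 1):
        yield {**base_params, 'page_num': page_num}


# Stand-in for page_params (assumption: the settings object is added in real use).
base_params = {'city': 'budapest',
               'listing_type': 'for-sale',
               'property_type': 'apartment',
               'page_num': 1}

for params in page_param_sets(base_params, max_page=3):
    print(params['page_num'], params['city'])
```

Each yielded dictionary could then be passed on as `RealEstateHungaryPageListings(**params)`, with `max_page` taken from the first page's `max_page` attribute. Adding a pause between requests is advisable to avoid overloading the website.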
Scrape all real estate listings on the given page:
listings_eng=real_estates_on_page.listings_to_df()
listings_eng.head()
property_url | city_district | lat | lng | building_material | condition_of_real_estate | area_size | price_in_eur | price_in_huf | convenience_level | desc | floors | orientation | ownership_status | type_of_heating | year_built |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
https://realestate.hu/... | Budapest, District XIII | 47.525306 | 19.068548 | Brick | Building in progress | 88 square meter | 209092 | 67430000 | NaN | Translated text... | 1st floor | Yard | NaN | In-house with unique meter | Newly built |
https://realestate.hu/... | Budapest, District III | 47.589843 | 19.065879 | Brick | Building in progress | 85 square meter | 204397 | 65915850 | NaN | Translated text... | Ground floor | NaN | NaN | In-house with unique meter | Newly built |
https://realestate.hu/... | Budapest, District V | 47.509952 | 19.053077 | Brick | Renovated | 83 square meter | 213650 | 68900000 | NaN | Translated text... | 3rd floor | Street front | NaN | Termosifone | 50+ years |
https://realestate.hu/... | Budapest, District VIII | 47.491388 | 19.070060 | Brick | Average | 101 square meter | 178610 | 57600000 | Modern convenience | Translated text... | NaN | NaN | NaN | Convector | NaN |
https://realestate.hu/... | Budapest, District XI | 47.48377 | 19.051580 | Brick | Good | 129 square meter | 369004 | 119000000 | Modern convenience | Translated text... | 4th floor | Panoramic | NaN | Termosifone | 50+ years |
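The `area_size` column holds text such as `88 square meter`, so it has to be converted to a number before any calculation. The following is a minimal sketch using only the standard library; `parse_area_sqm` and `price_per_sqm` are hypothetical helpers, and the text format is assumed from the sample rows above.

```python
import math
import re

# Assumed format, based on the sample rows: "<number> square meter".
_AREA_RE = re.compile(r'^\s*(\d+(?:\.\d+)?)\s*square meter\s*$')


def parse_area_sqm(text):
    """Return the area in square metres as a float, or NaN if it cannot be parsed."""
    if not isinstance(text, str):
        return math.nan  # covers missing values such as NaN
    match = _AREA_RE.match(text)
    return float(match.group(1)) if match else math.nan


def price_per_sqm(price, area_text):
    """Return price divided by the parsed area, or NaN if the area is unusable."""
    area = parse_area_sqm(area_text)
    return price / area if area > 0 else math.nan


# First sample row: 67,430,000 HUF for 88 square metres.
print(price_per_sqm(67430000, '88 square meter'))  # 766250.0
```

On the DataFrame itself, the parser could be applied with `listings_eng['area_size'].map(parse_area_sqm)`.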