Commit cc9853f

Add XPath scraper for Amateur Allure Classics
1 parent ba64e3e commit cc9853f

1 file changed, +35 -7 lines changed

scrapers/AmateurAllure.yml

Lines changed: 35 additions & 7 deletions
@@ -3,13 +3,20 @@ galleryByURL:
   - action: scrapeXPath
     url: &urls
       - amateurallure.com/tour/scenes/
-    scraper: galleryScraper
+    scraper: amateurAllure
+  - action: scrapeXPath
+    url: &classicUrls
+      - amateurallureclassics.com/scenes/
+    scraper: amateurAllureClassics
 sceneByURL:
   - action: scrapeXPath
     url: *urls
-    scraper: sceneScraper
+    scraper: amateurAllure
+  - action: scrapeXPath
+    url: *classicUrls
+    scraper: amateurAllureClassics
 xPathScrapers:
-  galleryScraper:
+  amateurAllure:
     common: &commonAttr
       $sceneinfo: //div[@class="scene-info"]
       $title: //span[@class='title_bar_hilite']
@@ -30,8 +37,6 @@ xPathScrapers:
       Studio: &studioAttr
         Name:
           fixed: Amateur Allure
-  sceneScraper:
-    common: *commonAttr
     scene:
       Title: *titleSel
       Code: *id
@@ -59,7 +64,30 @@ xPathScrapers:
           - replace:
               - regex: ^([^|]+amateurallure[^|]+)\|.+(/content/contentthumbs/\d+/\d+/[^/]+\.jpg) 1920w
                 with: $1$2
-              - regex: 1x
+              - regex: "[124]x"
                 with: "full"
       Studio: *studioAttr
-# Last Updated May 01, 2024
+  amateurAllureClassics:
+    common:
+      $scene: //div[contains(@class, "gallery_info")]
+      $excludeUpdates: not(ancestor::*[contains(@class, "category_listing_block")])
+    scene:
+      Title: //title
+      Date:
+        # Some sites hide their release date in a comment
+        selector: //*[(contains(@class, "availdate") or contains(@class, "update_date")) and contains(., "/")]
+        postProcess:
+          - replace:
+              - regex: ".*?([0-9]{2}/[0-9]{2}/[0-9]{4}).*"
+                with: $1
+          - parseDate: 01/02/2006
+      Details: $scene//span[@class="update_description"]
+      Performers:
+        Name: $scene//span[@class="update_models" and $excludeUpdates]/a
+      Tags:
+        Name: $scene//span[contains(@class, "update_tags")]/a
+      Studio:
+        Name:
+          fixed: Amateur Allure Classics
+      Image: //meta[@property="og:image"]/@content
+# Last Updated September 27, 2024
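
The Date post-processing added for amateurAllureClassics (a replace that isolates the first NN/NN/NNNN group, then parseDate with the layout 01/02/2006) can be checked outside the scraper. The sketch below is only an approximation in Python: it assumes parseDate uses Go's reference-time layout, so 01/02/2006 means month/day/year, and the sample input string and the helper name extract_date are hypothetical, not taken from the site or the scraper.

```python
import re
from datetime import datetime

# Regex copied from the Date postProcess step in the diff.
DATE_RE = re.compile(r".*?([0-9]{2}/[0-9]{2}/[0-9]{4}).*")


def extract_date(raw: str) -> str:
    """Approximate the replace + parseDate steps; return an ISO date string."""
    # "with: $1" keeps only the captured date; Python spells that \1.
    isolated = DATE_RE.sub(r"\1", raw, count=1)
    # Assumption: the layout 01/02/2006 corresponds to %m/%d/%Y.
    return datetime.strptime(isolated, "%m/%d/%Y").date().isoformat()


if __name__ == "__main__":
    # Hypothetical element text; the real page markup is not shown in the diff.
    print(extract_date("Available to Members 09/27/2024"))
```

Because the leading `.*?` is lazy, the first date-like group on the line is the one that survives, which matters if an element ever contains more than one date.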

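The Image change widens the replace pattern from 1x to "[124]x", so thumbnails with a 2x or 4x suffix are also rewritten to "full". A minimal Python sketch of the difference follows; the thumbnail paths are hypothetical examples shaped like the /content/contentthumbs/ pattern in the diff, and to_full is an illustrative helper, not part of the scraper.

```python
import re

OLD_RE = re.compile(r"1x")       # pattern before this commit
NEW_RE = re.compile(r"[124]x")   # pattern after this commit


def to_full(pattern: re.Pattern, url: str) -> str:
    """Replace every match of the size suffix pattern with 'full'."""
    return pattern.sub("full", url)


if __name__ == "__main__":
    # Hypothetical thumbnail paths; real URLs are not shown in the diff.
    for suffix in ("1x", "2x", "4x"):
        url = f"/content/contentthumbs/12/34/1234-{suffix}.jpg"
        print(url, "->", to_full(OLD_RE, url), "|", to_full(NEW_RE, url))
```

Note that the pattern is unanchored, so it replaces any occurrence of 1x, 2x, or 4x in the string, not only the one just before the file extension.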