# PSpider
- A simple web spider frame written by Python, which needs Python3.6+
+ A simple web spider frame written by Python, which needs Python3.8+
### Features of PSpider
@@ -18,14 +18,14 @@ A simple web spider frame written by Python, which needs Python3.6+
### Procedure of PSpider
![](procedure.png)
- ①: Fetchers get url from UrlQueue, and make requests based on this url
- ②: Put the result(content) of ① to HtmlQueue, and so Parser can get it
- ③: Parser gets content from HtmlQueue, and parses it to get new urls and item
- ④: Put the new urls to UrlQueue, and so Fetchers can get it
- ⑤: Put the item to ItemQueue, and so Saver can get it
- ⑥: Saver gets item from ItemQueue, and saves it to filesystem or database
- ⑦: Proxieser gets proxies from web or database, and puts proxies to ProxiesQueue
- ⑧: Fetcher gets proxies from ProxiesQueue if needed, and makes requests based on this proxies
+ ①: Fetchers get TaskFetch from QueueFetch, and make requests based on this task
+ ②: Put the result(TaskParse) of ① to QueueParse, and so Parser can get task from it
+ ③: Parser gets task from QueueParse, and parses content to get new TaskFetchs and TaskSave
+ ④: Put the new TaskFetchs to QueueFetch, and so Fetchers can get task from it again
+ ⑤: Put the TaskSave to QueueSave, and so Saver can get task from it
+ ⑥: Saver gets TaskSave from QueueSave, and saves items to filesystem or database
+ ⑦: Proxieser gets proxies from web or database, and puts proxies to QueueProxies
+ ⑧: Fetcher gets proxies from QueueProxies if needed, and makes requests based on this proxies
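
The queue hand-offs in steps ① to ⑥ can be illustrated with a minimal, single-threaded sketch built on the standard library's `queue.Queue`. This is not PSpider's API: `run_pipeline`, the `fetch`/`parse`/`save` callables, the `seen` set used to skip repeated URLs, and the in-memory `SITE` dictionary are all hypothetical names chosen for illustration, and the proxy steps ⑦ and ⑧ are left out.

```python
import queue


def run_pipeline(seed_urls, fetch, parse, save):
    """Move work through fetch -> parse -> save queues (illustrative only)."""
    queue_fetch = queue.Queue()  # plays the role of QueueFetch
    queue_parse = queue.Queue()  # plays the role of QueueParse
    queue_save = queue.Queue()   # plays the role of QueueSave
    seen = set()                 # assumption: skip URLs already queued

    for url in seed_urls:
        if url not in seen:
            seen.add(url)
            queue_fetch.put(url)

    while not (queue_fetch.empty() and queue_parse.empty() and queue_save.empty()):
        if not queue_fetch.empty():
            url = queue_fetch.get()            # step 1: take a fetch task
            content = fetch(url)
            queue_parse.put((url, content))    # step 2: hand result to the parser
        if not queue_parse.empty():
            url, content = queue_parse.get()   # step 3: parse the content
            new_urls, item = parse(url, content)
            for new_url in new_urls:           # step 4: queue newly found URLs
                if new_url not in seen:
                    seen.add(new_url)
                    queue_fetch.put(new_url)
            queue_save.put(item)               # step 5: hand the item to the saver
        if not queue_save.empty():
            save(queue_save.get())             # step 6: persist the item


# A fake three-page "site": each page maps to the links found on it.
SITE = {"a": ["b", "c"], "b": ["c"], "c": []}
saved = []

run_pipeline(
    ["a"],
    fetch=lambda url: SITE[url],
    parse=lambda url, links: (links, {"url": url, "links": len(links)}),
    save=saved.append,
)
print([item["url"] for item in saved])  # ['a', 'b', 'c']
```

The sketch runs the three stages in one loop so the output is deterministic; a real crawler would more likely run fetchers, the parser, and the saver as separate workers that block on their queues.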
### Tutorials of PSpider