Use multiple threads for generating the content to profit from multi-core CPUs

I don't have any experience with multiple threads in Python.

I think that it should not be that hard to split the (HTML) file generation part into multiple threads: the data is collected and static by then, and articles can be written independently of each other after the populate functions have run. I still have to check how directories are generated and whether some data structures get populated along the way.
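
A minimal sketch of that idea, assuming the per-article work can be expressed as a function of one already-populated entry. `render_article`, `write_article`, `write_all`, and the entry dictionaries are hypothetical stand-ins, not lazyblorg's real API. Directories are created up front in the main thread so that the workers don't race on them.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def render_article(entry: dict) -> str:
    """Hypothetical stand-in for the real HTML generation of one entry."""
    return f"<html><body><h1>{entry['title']}</h1></body></html>"


def write_article(target_dir: Path, entry: dict) -> Path:
    """Render one entry and write it; touches no shared state."""
    path = target_dir / entry["id"] / "index.html"
    path.write_text(render_article(entry), encoding="utf-8")
    return path


def write_all(target_dir: Path, entries: list[dict], workers: int = 4) -> list[Path]:
    # Create every directory first, in the main thread, so that the
    # workers never race on directory creation.
    for entry in entries:
        (target_dir / entry["id"]).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order and re-raises worker errors.
        return list(pool.map(lambda e: write_article(target_dir, e), entries))


if __name__ == "__main__":
    demo_entries = [{"id": f"post-{i}", "title": f"Post {i}"} for i in range(5)]
    with tempfile.TemporaryDirectory() as tmp:
        written = write_all(Path(tmp), demo_entries)
        print(len(written), "files written")
```

Whether this helps depends on how much of the time is spent on file I/O versus pure-Python string work, which is something only a measurement can tell.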

Here's the output of my current blog generation: 

```
INFO     • Parsing Org mode files …                                                                                
INFO     Parsed 22 Org-mode files with 1031079 lines (in 1.44 seconds)
INFO     • Generating articles …                                                                                   
INFO     • Building index of files …
INFO     Built index for 518638 files (in 1.64 seconds)                                                            
INFO     Generated 827 articles: 41 persistent, 706 temporal, 79 tag-pages, the entry page, and scaled 0 images (in 80.60 seconds)
```

It seems there's not much to gain in the parsing phase, as it is already fairly fast. The generating phase, however, takes 80.60 seconds and accounts for most of the run time.

With no particular knowledge, I'd guess that moving the individual "generate" functions into threads doesn't scale well: each call only generates one single entry, so the threading overhead might add significant time here.
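
One way to keep that overhead small might be to hand each worker a batch of entries rather than a single one. The sketch below is only an illustration: `generate_entry` is a hypothetical placeholder for the real per-entry function, and the batch size of 50 is arbitrary. Also note that in CPython the global interpreter lock keeps threads from running pure-Python code in parallel, so for CPU-bound generation `ProcessPoolExecutor` (which offers the same `map` interface) may be the better fit.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_entry(entry_id: int) -> str:
    """Hypothetical placeholder for generating a single blog entry."""
    return f"entry-{entry_id}.html"


def batches(items: list, size: int) -> list[list]:
    """Split items into consecutive batches of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def generate_batch(batch: list[int]) -> list[str]:
    """Generate all entries of one batch inside a single task."""
    return [generate_entry(entry_id) for entry_id in batch]


def generate_all(entry_ids: list[int], batch_size: int = 50, workers: int = 4) -> list[str]:
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(generate_batch, batches(entry_ids, batch_size))
        # Flatten the per-batch lists back into one list, in input order.
        return [name for batch in results for name in batch]


if __name__ == "__main__":
    ids = list(range(827))
    names = generate_all(ids)
    print(len(names), "entries generated in", len(batches(ids, 50)), "tasks")
```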

I guess that running a profiler would be the best way to determine which parts to move into threads, but I don't have experience with that either. Without a profiler analysis, I'd maybe move the following parts of https://github.com/novoid/lazyblorg/blob/master/lib/htmlizer.py into threads:

- the general pages: entry page, tag cloud, ...
- all instances of:
  - _scale_and_write_image_file()
  - _copy_image_file_without_exif()
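
For the profiling step, the standard library's `cProfile` and `pstats` modules are probably enough to see where the 80 seconds go. A minimal sketch, where `generate_blog` is a hypothetical stand-in for the real generation entry point and `profile_call` is a helper invented for this example:

```python
import cProfile
import io
import pstats


def generate_blog() -> int:
    """Hypothetical stand-in for the real generation entry point."""
    return sum(len(str(i)) for i in range(100_000))


def profile_call(func, top: int = 10) -> str:
    """Run func under cProfile and return a report of the `top` entries,
    sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_call(generate_blog))
```

A whole script can also be profiled without code changes via `python -m cProfile -s cumulative` followed by the script name and its usual arguments.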

Maybe you do have experience with this and are able to run a quick test of whether this is a quick-win task?

Use multiple threads for generating the content to profit from multi-core CPUs #115

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Use multiple threads for generating the content to profit from multi-core CPUs #115

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions