Commit f47d126

4.7.0 release
1 parent 549e519 commit f47d126

2 files changed

+170
-1
lines changed

site/blog/soupault-4.7.0-release.md

+167
@@ -0,0 +1,167 @@
1+
<h1 id="post-title">Soupault 4.7.0 release: CSV support, global shared data, post-build hook, and more</h1>
2+
3+
<p>Date: <time id="post-date">2023-09-19</time> </p>
4+
5+
<p id="post-excerpt">
6+
Soupault 4.7.0 is available for download from <a href="https://files.baturin.org/software/soupault/4.7.0">my own server</a>
7+
and from <a href="https://github.com/PataphysicalSociety/soupault/releases/tag/4.7.0">GitHub releases</a>.
8+
It adds support for loading CSV files, a variable for passing global data between plugins and hooks,
9+
a way to determine which pass of the two-pass workflow a plugin is running in, and a few more improvements.
10+
</p>
11+
12+
## Configurable page character encoding
13+
14+
By default, soupault assumes that all pages are stored in UTF-8. I would encourage everyone to migrate to it,
15+
now that all operating systems use it by default. But there are certainly sites that are older than the
16+
widespread deployment of UTF-8, and there are tools that still produce legacy encodings as well.
17+
18+
Now it's possible to specify the encoding explicitly for such cases:
19+
20+
```toml
21+
[settings]
22+
page_character_encoding = 'utf-8'
23+
```
24+
25+
The following encodings are supported: `ascii`, `iso-8859-1`, `windows-1251`, `windows-1252`, `utf-8`,
26+
`utf-16`, `utf-16le`, `utf-16be`, `utf-32le`, `utf-32be`, and `ebcdic`.
27+
You can write those options in either upper or lower case (e.g., `UTF-16LE`, `UTF-16le`, and `utf-16le`
28+
are equally acceptable). You cannot omit hyphens or replace them with underscores, though.
29+
30+
## Plugin support for the two-pass workflow
31+
32+
Soupault supports a [two-pass workflow](/reference-manual/#making-index-data-available-to-every-page)
33+
that allows users to make the index data available to all pages (even to content pages).
34+
35+
That feature comes at the cost of duplicating some of the page processing work (at the very least, HTML parsing
36+
and index extraction), but enables use cases that would be impossible otherwise.
37+
For example, the [book blueprint](https://github.com/PataphysicalSociety/soupault-blueprints-book)
38+
uses that capability to inject a fully auto-generated chapter list sidebar in every page,
39+
while its main competitor, [mdBook](https://rust-lang.github.io/mdBook/), requires a hand-written chapter list.
40+
41+
However, until this release, plugins could only guess where soupault was in its website build process,
42+
e.g., by checking if the `site_index` table was empty. That approach is not foolproof and absolutely not flexible.
43+
44+
Now there's a new `soupault_pass` plugin environment variable: 0 when `index_first = false`, 1 and 2 for the first and the second pass respectively when it's true.
45+
Thus plugins can check whether the two-pass workflow is enabled at all and find out which pass is currently running.
46+
47+
```lua
48+
if soupault_pass < 2 then
49+
  -- Do nothing
50+
else
51+
  -- Do things that require index data
52+
end
53+
```
54+
55+
## Global data shared between all plugins and hooks
56+
57+
There was already a `persistent_data` variable that plugins could use to preserve data, for example,
58+
to calculate the total reading time of all pages and output it on a specific page.
59+
60+
However, there was no way for plugins and hooks to share any data. For example, suppose you want to profile
61+
your website build and measure the time it takes to build each page. You could call `Date.now_timestamp()`
62+
in `pre-parse` and `post-save` hooks, then subtract the start time from the end time... but where would you store
63+
that data to make it available to both hooks? Technically, you could inject it in the page,
64+
but that's a rather dirty hack.
65+
66+
Now there's a new variable named `global_data` that allows different plugins and hooks to communicate
67+
without any dirty hacks. You could just do something like `global_data["start_time"] = Date.now_timestamp()`
68+
in the `pre-parse` hook and access it from the `post-save` hook easily.
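
As a minimal sketch of that timing idea, the two hook bodies might look like the following. The `start_time` key is the one from the example above; the log message is illustrative, and the sketch assumes `Date.now_timestamp()` returns whole seconds, so very fast pages may simply show 0.

```lua
-- Body of the pre-parse hook: remember when work on this page started.
global_data["start_time"] = Date.now_timestamp()

-- Body of the post-save hook (separate code, configured as its own hook):
-- read the value stored by pre-parse and log the difference.
elapsed = Date.now_timestamp() - global_data["start_time"]
Log.debug(format("Page build time: %d s", elapsed))
```

A single key is enough here only because pages are processed one at a time. The sketch also ignores the two-pass workflow, where the measurement would likely need more care.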
69+
70+
This feature certainly comes at a cost: it will make it harder for soupault to process pages in parallel in the future.
71+
Making soupault use more than one worker thread is now blocked by the fact that Lua-ML, the Lua interpreter it uses,
72+
is neither reentrant nor thread-safe and needs a deep refactoring to make it so. When that part is done,
73+
there will be more questions about the right design for multi-core soupault workflows, but that's a question for the future.
74+
75+
## CSV support
76+
77+
Soupault can already load JSON, TOML, and YAML data files. However, what if you want to create a website
78+
for a product catalog for a small store? A lot of data is kept in spreadsheets or local databases,
79+
and the most common export format for such data is CSV.
80+
81+
Now soupault supports loading CSV files, but that's not all — it can also convert CSV data with a correct header
82+
to a list of objects that you can easily pass to a template for rendering.
83+
84+
These are the new functions:
85+
86+
* `CSV.from_string(str)` — parses CSV data and returns it as a list (i.e., an int-indexed table) of lists.
87+
* `CSV.unsafe_from_string(str)` — like `CSV.from_string` but returns `nil` on errors instead of raising an exception.
88+
* `CSV.to_list_of_tables(csv_data)` — converts CSV data with a header returned by `CSV.from_string` into a list of string-indexed tables for easy rendering.
89+
90+
Now let's look at the `CSV.to_list_of_tables` function in action. Let's write a Lua snippet with CSV data embedded in it for demonstration:
91+
92+
```lua
93+
csv_source = [[name,price,comment
94+
baby shoes,5,never worn
95+
fake amulet of Yendor,1,uncursed
96+
]]
97+
98+
csv_data = CSV.from_string(csv_source)
99+
Log.debug(format("Raw CSV data: %s", JSON.pretty_print(csv_data)))
100+
csv_table = CSV.to_list_of_tables(csv_data)
101+
Log.debug(format("Converted CSV data: %s", JSON.pretty_print(csv_table)))
102+
```
103+
104+
If you add it to a plugin and run soupault, you will see the following output:
105+
106+
```
107+
[INFO] Processing widget csv-test on page site/index.html
108+
[DEBUG] Raw CSV data: [
109+
[
110+
"name",
111+
"price",
112+
"comment"
113+
],
114+
[
115+
"baby shoes",
116+
5,
117+
"never worn"
118+
],
119+
[
120+
"fake amulet of Yendor",
121+
1,
122+
"uncursed"
123+
]
124+
]
125+
126+
[DEBUG] Converted CSV data: [
127+
{
128+
"price": 5,
129+
"comment": "never worn",
130+
"name": "baby shoes"
131+
},
132+
{
133+
"price": 1,
134+
"comment": "uncursed",
135+
"name": "fake amulet of Yendor"
136+
}
137+
]
138+
```
139+
140+
As you can see, the "converted CSV data" can be directly passed to a template like this:
141+
142+
```jinja2
143+
{% for i in items %}
144+
Item {{i.name}} ({{i.comment}}) is sold for {{i.price}}.
145+
{% endfor %}
146+
```
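
To connect the pieces, here is a rough sketch of a plugin that loads a CSV file, converts it, and renders it with a template like the one above. The file name `catalog.csv` and the `#catalog` selector are made up for illustration, and the sketch assumes the CSV file has a valid header row.

```lua
-- Hypothetical input file; adjust the path to your site layout.
csv_source = Sys.read_file("catalog.csv")
items = CSV.to_list_of_tables(CSV.from_string(csv_source))

template = [[
<ul>
{% for i in items %}
  <li>{{i.name}} ({{i.comment}}) is sold for {{i.price}}.</li>
{% endfor %}
</ul>
]]

-- The template environment maps the name used in the template to the data.
env = {}
env["items"] = items
html = String.render_template(template, env)

-- Hypothetical target element; nothing happens on pages that lack it.
container = HTML.select_one(page, "#catalog")
if container then
  HTML.append_child(container, HTML.parse(html))
end
```

If the file may be malformed, `CSV.unsafe_from_string` with a `nil` check could be used in place of `CSV.from_string`.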
147+
148+
## Other new features and improvements
149+
150+
* New `max_items` option in index views allows limiting the number of displayed items.
151+
* New `post-build` hook that runs when all pages are processed and soupault is about to terminate.
152+
* Info logs to indicate the first and second passes in the `index_first = true` mode.
153+
* Debug logs now tell why a page is included or excluded from an index view: `"page_included checks for %s: regex=%b, page=%b, section=%b"`
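
The `post-build` hook combines naturally with `global_data`. As a sketch, a `post-save` hook could count saved pages and the `post-build` hook could report the total. The `page_count` key is an arbitrary name chosen for this example, and the sketch assumes `global_data` is visible in both hooks.

```lua
-- Body of the post-save hook: count each page as it is written.
if not global_data["page_count"] then
  global_data["page_count"] = 0
end
global_data["page_count"] = global_data["page_count"] + 1

-- Body of the post-build hook (separate code): report the total once.
count = global_data["page_count"]
if not count then
  -- No page was saved, so the counter was never initialized.
  count = 0
end
Log.info(format("Pages saved: %d", count))
```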
154+
155+
### Other new plugin API functions
156+
157+
* `HTML.swap(l, r)` — swaps two elements in an element tree.
158+
* `HTML.wrap(node, elem)` — wraps `node` in `elem`.
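
For instance, `HTML.wrap` makes it short to put an existing element inside a new container. The sketch below wraps the first image on a page in a `<figure>`; the choice of elements is purely illustrative.

```lua
img = HTML.select_one(page, "img")
if img then
  -- Create an empty <figure> and wrap the image in it.
  figure = HTML.create_element("figure")
  HTML.wrap(img, figure)
end
```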
159+
160+
## Bug fixes
161+
162+
* Fixed an unhandled exception on index entry sorting failures when `strict_sort = true` and `sort_by` is unspecified.
163+
* Fixed a typo in the comments of the config generated by `soupault --init` (s/ULRs/URLs/).
164+
165+
## Platform support
166+
167+
Official binaries are now available for Linux on ARM64 (e.g., Raspberry Pi 3 and 4).

soupault.toml

+3-1
@@ -28,13 +28,14 @@
2828
[custom_options]
2929
site_url = "https://soupault.app"
3030

31-
latest_soupault_version = "4.6.0"
31+
latest_soupault_version = "4.7.0"
3232

3333
[index]
3434
index = true
3535

3636
sort_descending = true
3737
sort_by = "date"
38+
#strict_sort = true
3839

3940
extract_after_widgets = ["insert-reading-time"]
4041
strip_tags = false
@@ -266,3 +267,4 @@ page_source = Regex.replace_all(page_source, "\\$SOUPAULT_RELEASE\\$", soupault_
266267
feed_title = "soupault"
267268
feed_subtitle = "A static website generator and programmable HTML processor"
268269
feed_logo = "https://soupault.app/images/soupault_stick_horse.png"
270+
