Skip to content

Conversation

@adevyish
Copy link

@adevyish adevyish commented Dec 6, 2018

  • Fix exception if tag has / in it
  • Fix tag links not working

- Fix exception if tag has `/` in it
- Fix tag links not working
@bbolli
Copy link
Owner

bbolli commented Dec 6, 2018

That's great, thanks! Does slugify() also work with Windows?
And, seeing that you don't need the first tuple element, it should be removed completely.

@adevyish
Copy link
Author

adevyish commented Dec 6, 2018

I think it should but I don’t have access to a testing environment. Can fix up the tuple thing tomorrow but I was trying to get it working quickly 😅

tumblr_backup.py Outdated
self.fixup_media_links()
tag_index = [self.blog.header('Tag index', 'tag-index', self.blog.title, True), '<ul>']
for tag, index in sorted(self.tags.items(), key=lambda kv: kv[1].name):
for _, index in sorted(self.tags.items(), key=lambda kv: kv[1].name):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use tags.values() to get just the values instead of tuples

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I’m aware of this.

@adevyish
Copy link
Author

adevyish commented Dec 7, 2018

As a sidenote, I didn't want to add an additional dependency but unicode-slugify also handles if you want to slug with non-ascii characters (which I'm doing for my own archive, since I have plenty of CJK tags)

@aspensmonster
Copy link

It seems like tumblr tags are a wonderful example of diverse user input that's always trying to outsmart the slug code. CJK tags, tags with slashes, tags with all kinds of odd unicode, emoji, multiple tags condensing down to one slug (whose sets aren't identical, so the rendered HTML is incomplete)...

Assuming you don't mind too much what the folder name is --if your use of the backup is mostly using the rendered HTML-- this approach seems to work for me for various weird tags (haven't found a broken or empty link yet, though there are thousands of tags in my backup):

import hashlib
...
tag_index = [self.blog.header('Tag index', 'tag-index', self.blog.title, True), '<ul>']
for index in sorted(self.tags.values(), key=lambda v: v.name):
    tag = hashlib.sha256(index.name.encode('utf-8')).hexdigest()
    etc etc etc

I'm also pretty sure hashlib is part of the standard library, so no additional module install is needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants