@N0mikon N0mikon commented Jan 11, 2026

Summary

This PR fixes two related issues with recipe importing:

1. URL Import Fix (line ~2394)

Problem: "Import from URL" on the /recipe/import page returns an empty result (HTTP 200 with no recipe data), while the bookmarklet import works for the same URLs.

Root Cause: The existing code used requests.get() to fetch page HTML, then passed it to scrape_html(). Many modern recipe sites (Food Network, AllRecipes, etc.) inject their JSON-LD schema data via JavaScript. Since requests.get() only fetches raw HTML without executing JavaScript, the schema data was missing and scrape_html() couldn't find any recipe data.

The bookmarklet worked because it captures document.documentElement.outerHTML from the browser - the fully-rendered DOM after JavaScript execution.
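
To make the difference between raw HTML and the rendered DOM concrete, here is a minimal sketch that checks whether a given HTML string contains a JSON-LD Recipe object at all. It uses only the standard library; `JsonLdCollector` and `has_recipe_jsonld` are hypothetical helper names for illustration and are not part of this PR or of recipe_scrapers.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def _is_recipe(node):
    """Return True if a JSON-LD node declares @type Recipe (string or list)."""
    if not isinstance(node, dict):
        return False
    types = node.get("@type")
    if isinstance(types, str):
        types = [types]
    return isinstance(types, list) and "Recipe" in types


def has_recipe_jsonld(html):
    """Return True if the HTML string embeds at least one JSON-LD Recipe."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD blocks
        nodes = data if isinstance(data, list) else [data]
        # Some sites wrap their objects in an "@graph" array.
        for node in list(nodes):
            if isinstance(node, dict) and isinstance(node.get("@graph"), list):
                nodes = nodes + node["@graph"]
        if any(_is_recipe(node) for node in nodes):
            return True
    return False
```

Running a check like this against the HTML saved from the server response and against the `outerHTML` captured by the bookmarklet is one way to confirm whether the schema data is missing from the raw response for a particular site.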

Fix: Call scrape_html(html=None, org_url=url, online=True, supported_only=False), which lets recipe_scrapers perform the HTTP request itself, with:

  • Proper User-Agent handling
  • Site-specific request logic
  • Better compatibility with various recipe sites
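
A minimal sketch of the call described above, assuming an installed recipe_scrapers version whose `scrape_html` accepts the `online` keyword. The wrapper name `scrape_recipe_from_url` and its `scrape` parameter are hypothetical and exist only so the call can be exercised without network access; the actual change in cookbook/views/api.py may be structured differently.

```python
def scrape_recipe_from_url(url, scrape=None):
    """Let recipe_scrapers fetch and parse the page itself.

    `scrape` is an injectable callable (an assumption made for testability);
    when omitted, recipe_scrapers.scrape_html is imported and used.
    """
    if scrape is None:
        # Imported lazily so this sketch can be loaded without the library.
        from recipe_scrapers import scrape_html as scrape

    # html=None plus online=True asks the library to perform the HTTP request;
    # supported_only=False also allows sites without a dedicated scraper.
    return scrape(html=None, org_url=url, online=True, supported_only=False)
```

Compared with the previous requests.get() followed by scrape_html() on the response body, the view no longer has to manage headers for this request itself.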

2. Image Download Fix (line ~1657)

Problem: Recipe images fail to download from some CDNs and image hosting services.

Root Cause: The User-Agent was set to Firefox 86 (released February 2021). Some servers reject requests with this User-Agent or respond with different content.

Fix: Updated to a modern Chrome User-Agent and added supporting request headers:

  • Modern Chrome 120 User-Agent string
  • Accept header with image MIME types
  • Accept-Language header
  • Referer header for sites that validate referrers
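
The header set listed above could look roughly like the following sketch. The exact header values, the helper name `build_image_headers`, and the choice to derive the Referer from the image URL's own origin are assumptions for illustration; the PR text does not spell them out.

```python
from urllib.parse import urlsplit

# Illustrative Chrome 120 User-Agent string (assumed; the exact string in the PR may differ).
CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def build_image_headers(image_url):
    """Build browser-like request headers for downloading a recipe image."""
    headers = {
        "User-Agent": CHROME_UA,
        # Prefer image types, but accept anything as a fallback.
        "Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.9",
    }
    parts = urlsplit(image_url)
    # Assumption: use the image's own origin as the Referer when it is a full URL.
    if parts.scheme and parts.netloc:
        headers["Referer"] = f"{parts.scheme}://{parts.netloc}/"
    return headers
```

The resulting dictionary can be passed as the `headers` argument of `requests.get()` when fetching the image.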

Test Plan

  • Tested URL import with AllRecipes, Food Network URLs
  • Verified bookmarklet still works
  • Tested image downloads from various CDNs
  • Built and ran develop branch with patches in Docker

Changes

  • cookbook/views/api.py: 2 locations modified (10 insertions, 7 deletions)

🤖 Generated with Claude Code

…patibility

URL Import Fix:
- Changed from requests.get() + scrape_html() to scrape_html(online=True)
- This lets recipe_scrapers handle HTTP requests with proper User-Agent,
  cookies, and site-specific logic
- Fixes imports from sites that inject JSON-LD schema via JavaScript,
  which was causing empty results while the bookmarklet worked fine

Image Download Fix:
- Updated User-Agent from outdated Firefox 86 (2021) to modern Chrome
- Added proper Accept headers for image content types
- Added Referer header for sites that validate referrers
- Improves compatibility with CDNs and image hosting services

Co-Authored-By: Claude <[email protected]>

CLAassistant commented Jan 11, 2026

CLA assistant check
All committers have signed the CLA.
