Feature request: Population of common fields for Resource Types: Articles, Blogs and Updates & News Article

**Is your feature request related to a problem? Please describe.**
For the Resource Type "Articles, Blogs and Updates", much of the information and metadata is often already available in the published article and can be extracted with existing libraries.

**Describe the solution you'd like**
If the metadata is present in the article, I would like some of the fields to be autofilled, or offered as an option to fill in, either directly in Plone or indirectly via the REST API.

- Default Tab:
  - Title: <Site/Publisher name> - Article Title
  - Summary: Article Summary
  - Lead Image: Screenshot of the article as rendered, generated with Python
  - Lead Image caption: Caption from the article or a LLaVA-based description of the image
  - Text: <clean article text> OR <rich HTML text via manual copy-paste?>
  - Resource Type: "Articles, Blogs and Updates" (may apply to "News Article" too, and possibly "Newsletter, Journal" and "Press Statement or News Release" if it's website-based)

- Ownership Tab:
  - Contributors: Article Author
  - Rights: Copyright information

- Dates Tab:
  - Publishing Date: Get from site/article Publishing Date
  - Rights: Copyright information

- Categorization Tab:
  - Countries: <default to Malaysia?> or <detect based on URL domain or article content?> or leave blank; already mentioned in [#3](https://github.com/Sinar/sinar.resource/issues/3)
  - SDG Goals: <set to the main SDG according to the article text content, possibly using a library such as Seesus or a HuggingFace model>
  - Development Themes: <use ML or LLM to help categorize or recommend> or leave blank
  
- Partners Tab:
  - Accountable: already mentioned in [#3](https://github.com/Sinar/sinar.resource/issues/3), or leave blank if unsure
  - Implementing Partners: already mentioned in [#3](https://github.com/Sinar/sinar.resource/issues/3), or leave blank if unsure
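
The field mapping above could be sketched as a plain function that turns extracted article metadata into a dictionary keyed by field. This is a minimal sketch, not the sinar.resource schema: `title`, `description`, `contributors`, `rights` and `effective` are the standard Plone Dublin Core field names, while `resource_type`, `countries`, the `ArticleMeta` class and the ccTLD lookup table are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urlparse

# Hypothetical country-code TLD lookup; extend as needed.
CCTLD_COUNTRIES = {"my": "Malaysia", "id": "Indonesia", "sg": "Singapore"}


@dataclass
class ArticleMeta:
    """Hypothetical container for whatever the extraction step returns."""
    url: str
    title: str
    sitename: str = ""
    summary: str = ""
    authors: list = field(default_factory=list)
    published: Optional[str] = None  # ISO 8601 date string, if found
    rights: str = ""


def guess_country(url: str) -> Optional[str]:
    """Guess a country from the URL's country-code TLD, else return None."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return CCTLD_COUNTRIES.get(tld)


def build_fields(meta: ArticleMeta) -> dict:
    """Map extracted metadata to form fields, leaving unknown values out."""
    title = f"{meta.sitename} - {meta.title}" if meta.sitename else meta.title
    fields = {
        "title": title,
        "description": meta.summary,
        "contributors": list(meta.authors),
        "rights": meta.rights,
        "effective": meta.published,
        "resource_type": "Articles, Blogs and Updates",  # assumed field name
    }
    country = guess_country(meta.url)
    if country:
        fields["countries"] = [country]  # assumed field name
    # Drop empty values so the editor sees blanks rather than guesses.
    return {key: value for key, value in fields.items() if value}
```

Leaving empty values out, rather than filling them with defaults, matches the "leave blank if unsure" approach in the list above.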

**Describe alternatives you've considered**
As the current web-based flow for adding a Resource may not facilitate entry of this metadata without a significant redesign of the workflow, **a Jupyter Notebook that extracts the necessary metadata and submits it as a new Resource is also a possible option**.
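
A notebook cell could submit the collected fields through plone.restapi, which creates content from a JSON `POST` to the container URL with an `@type` key in the body. The sketch below only builds the request, using the standard library. The container URL, the credentials and the `"Resource"` type name are placeholders and assumptions, not values confirmed for sinar.resource.

```python
import base64
import json
import urllib.request


def build_create_request(container_url, fields, username, password,
                         portal_type="Resource"):
    """Build (but do not send) a plone.restapi content-creation request.

    portal_type is an assumed name; check the type id on the target site.
    """
    body = dict(fields)
    body["@type"] = portal_type
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        container_url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Usage (placeholders; sending requires a reachable Plone site):
# req = build_create_request("https://plone.example/resources",
#                            {"title": "Example News - Budget tabled"},
#                            "editor", "secret")
# with urllib.request.urlopen(req) as resp:
#     created = json.load(resp)
```

Keeping request construction separate from sending lets the payload be reviewed in the notebook before anything is written to the site.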

**Additional context**
- Article metadata extraction can use the newspaper3k Python library
- Clean article text might be better extracted using the trafilatura Python library
- Image: the screenshot can use [playwright](https://www.zenrows.com/blog/playwright-screenshot#screenshot-visible-parts) or something like [witnessme](https://github.com/byt3bl33d3r/WitnessMe?tab=readme-ov-file#screenshot-mode)
- Image caption can use Tesseract (OCR of text in the image) or LLaVA (generated description)
- SDG detection can use the seesus Python library
- Partners detection relies on the part described in [#3](https://github.com/Sinar/sinar.resource/issues/3)
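
As a rough sketch of the first two points, trafilatura can return both the clean text and basic metadata from downloaded HTML. `fetch_url`, `extract` and `extract_metadata` are trafilatura functions, but the attributes on the returned metadata object may differ between versions, so `to_fields` reads them defensively. The helper names `to_fields` and `extract_article`, and the assumption that multiple authors are separated by `;`, are illustrative.

```python
def to_fields(metadata, text):
    """Copy the attributes of interest from a metadata object into a dict.

    getattr with a default keeps this tolerant of missing attributes
    (and of metadata being None when extraction fails).
    """
    author = getattr(metadata, "author", None)
    return {
        "title": getattr(metadata, "title", None),
        "sitename": getattr(metadata, "sitename", None),
        "summary": getattr(metadata, "description", None),
        # Assumption: multiple authors arrive as one ";"-separated string.
        "authors": [a.strip() for a in author.split(";")] if author else [],
        "published": getattr(metadata, "date", None),
        "text": text,
    }


def extract_article(url):
    """Download a page and extract clean text plus metadata (needs network)."""
    # Third-party dependency: pip install trafilatura
    import trafilatura
    from trafilatura.metadata import extract_metadata

    downloaded = trafilatura.fetch_url(url)
    if downloaded is None:
        return None  # download failed
    text = trafilatura.extract(downloaded)
    metadata = extract_metadata(downloaded)
    return to_fields(metadata, text)
```

newspaper3k could be swapped in for the metadata step if its results prove more complete for a given site; only `extract_article` would need to change.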

Feature request: Population of common fields for Resources Type: Articles, Blogs and Updates & News Article #4
