Discussion on implementing selectolax support #239

Open
@deepakdinesh1123

Here are some of the changes I am thinking of implementing:

High-level changes:

  1. The Selector class takes a new argument, "parser", which indicates which parser backend to use (lxml or selectolax).
  2. Selectolax itself provides two backends, Lexbor and Modest, and uses Modest by default. Should support for Lexbor be added as well? We could use Modest by default and let users pass an argument if they want Lexbor.
  3. If the "parser" argument is not provided, lxml is used by default. This preserves the current behavior and backward compatibility, and it lets the existing test suite run without changes to any of the existing methods.
  4. If the xpath method is called on a selector instantiated with selectolax as the parser, raise NotImplementedError.
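
To make the high-level behavior concrete, here is a rough sketch of the dispatch logic only. `SketchSelector` and `SUPPORTED_PARSERS` are hypothetical names used for illustration, not parsel's actual classes, and the lxml branch is simplified to a plain `lxml.html` call.

```python
# Hypothetical names throughout; this is not parsel's real Selector.
SUPPORTED_PARSERS = ("lxml", "selectolax")


class SketchSelector:
    """Illustrates the proposed "parser" argument and the xpath restriction."""

    def __init__(self, text: str, parser: str = "lxml") -> None:
        # lxml stays the default so existing callers see no change.
        if parser not in SUPPORTED_PARSERS:
            raise ValueError(f"unknown parser: {parser!r}")
        self.text = text
        self.parser = parser

    def xpath(self, query: str):
        # selectolax only offers CSS selection, so XPath is rejected here.
        if self.parser == "selectolax":
            raise NotImplementedError(
                "xpath() is not supported with the selectolax parser"
            )
        # Imported lazily so the sketch loads even without lxml installed.
        from lxml import html

        return html.fromstring(self.text).xpath(query)
```

With this shape, `SketchSelector(text)` behaves as before, and only callers who explicitly pass `parser="selectolax"` lose access to `xpath`.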

Low-level changes:

  1. Add selectolax to the list of parsers in _ctgroup and modify create_root_node to instantiate the selected parser with the provided data.
  2. Modify the xpath and css methods to work with both selectolax and lxml, or write separate methods or classes to handle each.
  3. Use the HTMLParser class in selectolax and its css method to apply the given CSS expression and return the matched data.
  4. Create a SelectorList of Selector objects created with the specified type and parser.
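
The low-level steps could look roughly like the sketch below. `create_root_node_sketch` and `css_texts` are hypothetical helpers and do not match the signature of parsel's real `create_root_node`; the `backend` argument is an assumption based on the Modest/Lexbor question above. The selectolax imports (`selectolax.parser.HTMLParser`, `selectolax.lexbor.LexborHTMLParser`) are done lazily, so the module loads even if only one library is installed.

```python
def create_root_node_sketch(text: str, parser: str = "lxml", backend: str = "modest"):
    """Hypothetical dispatcher that builds a root node with the chosen parser."""
    if parser == "lxml":
        from lxml import html

        return html.fromstring(text)
    if parser == "selectolax":
        if backend == "modest":
            from selectolax.parser import HTMLParser

            return HTMLParser(text)
        if backend == "lexbor":
            from selectolax.lexbor import LexborHTMLParser

            return LexborHTMLParser(text)
        raise ValueError(f"unknown selectolax backend: {backend!r}")
    raise ValueError(f"unknown parser: {parser!r}")


def css_texts(text: str, query: str, backend: str = "modest"):
    """Apply a CSS expression through selectolax and return each match's text."""
    root = create_root_node_sketch(text, parser="selectolax", backend=backend)
    return [node.text() for node in root.css(query)]
```

In a real implementation the matched nodes would be wrapped in Selector objects and collected into a SelectorList rather than reduced to text; the list of strings here only keeps the sketch short.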

This is still a work in progress and I will make a lot of changes. Please suggest any changes that need to be made to the current list.
