Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: new DocTypes "Code List" and "Common Code" #43425

Open
wants to merge 38 commits into
base: develop
Choose a base branch
from

Conversation

barredterra
Copy link
Collaborator

@barredterra barredterra commented Sep 29, 2024

Bildschirmfoto 2024-09-29 um 13 26 10 Bildschirmfoto 2024-09-29 um 13 41 20
In [5]: get_code_for(code_list, "Account", "Tax Assets - WPL")
Out[5]: 'Z'

In [6]: get_doc_for(code_list, "Account", "Z")
Out[6]: 'Tax Assets - WPL'

Screenshots based on https://www.xrepository.de/api/xrepository/urn:xoev-de:kosit:codeliste:untdid.5305_3:technischerBestandteilGenericode:datei:UNTDID_5305-3.xml

After this is merged, check if we should revert #40593 and #40663

no-docs (for now)

@barredterra
Copy link
Collaborator Author

barredterra commented Sep 30, 2024

  • It is not obvious (and was not intended by me) that the CanonicalVersionUri now has to be used as the name of the code list (for the import to work). E.g. I would not expect the CanonicalVersionUri of the code list linked above to be referenced in most eInvoices. So a more generic name would be useful for retrieving the necessary code list. In the screenshot above, I used the Identification/ShortName.
  • "Additional Data" from the code list is not yet imported (e.g. descriptions, as available in the code list linked above).

Ensure the same reference is not used on a different Common Code of the same Code List.
@blaggacao
Copy link
Collaborator

More exploration from my side: barredterra#1 (Edi Template & Edi Log, squashed your naming fix
there, @barredterra )

@blaggacao
Copy link
Collaborator

blaggacao commented Sep 30, 2024

It is not obvious (and was not intended by me) that the CanonicalVersionUri now has to be used as the name of the code list (for the import to work). E.g. I would not expect the CanonicalVersionUri of the code list linked above to be referenced in most eInvoices.

Firstly, most of the time, however, unlike the common codes, these lists are akin to real constants in an e-invoice template. I would typically hard-code them for each distinct edi template. Here is an example:

XML Example
        <cbc:IdentificationCode
          listAgencyID="6"
          listAgencyName="United Nations Economic Commission for Europe"
          listSchemeURI="urn:oasis:names:specification:ubl:codelist:gc:CountryIdentificationCode-2.1"
        >CO</cbc:IdentificationCode>

Secondly, the title is still user friendly, for display purposes.

Screenshot

image

Thirdly, as per the Genericode XML Schema, CanonicalVersionUri is the primary key of a list, so we would have to go out of our way to work around that. https://docs.oasis-open.org/codelist/genericode/v1.0/os/xsd/genericode.xsd

XML Reference
<xsd:complexType name="Identification">
<xsd:annotation>
<xsd:documentation>Identification and location information (metadata).</xsd:documentation>
</xsd:annotation>
<xsd:sequence>
<xsd:group ref="gc:NameSet">
...
</xsd:group>
<xsd:element name="Version" type="xsd:token">
...
</xsd:element>
<xsd:element name="CanonicalUri" type="xsd:anyURI">
<xsd:annotation>
<xsd:documentation>Canonical URI which uniquely identifies all versions (collectively).</xsd:documentation>
<xsd:documentation>
<rule:text id="R25" category="document">Must be an absolute URI, must not be relative.</rule:text>
</xsd:documentation>
<xsd:documentation>
<rule:text id="R26" category="application">Must not be used as a de facto location URI.</rule:text>
</xsd:documentation>
</xsd:annotation>
</xsd:element>
<xsd:group ref="gc:VersionLocationUriSet">
...
</xsd:group>
<xsd:element name="Agency" minOccurs="0" type="gc:Agency">
...
</xsd:element>
</xsd:sequence>
</xsd:complexType>

...

<xsd:group name="VersionLocationUriSet">
<xsd:annotation>
<xsd:documentation>Identification and location URIs for a version.</xsd:documentation>
</xsd:annotation>
<xsd:sequence>
<xsd:element name="CanonicalVersionUri" type="xsd:anyURI">
...
</xsd:element>
<xsd:element name="LocationUri" minOccurs="0" maxOccurs="unbounded" type="xsd:anyURI">
...
</xsd:element>
<xsd:element name="AlternateFormatLocationUri" minOccurs="0" maxOccurs="unbounded" type="gc:MimeTypedUri">
...
</xsd:element>
</xsd:sequence>
</xsd:group>

Note, that CanonicalUri will be shared between lists, and within VerstionLocationUriSet (the unique identifying set), CanoncialVersionUri is the only mandatory one.

Edit:

I think this is what you where looking for:

image

blaggacao
blaggacao previously approved these changes Sep 30, 2024
Copy link
Collaborator

@blaggacao blaggacao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have contributed the Genericode Importer which triggered a certain amount of polishing and I think this version is ready to merge, now.

We should keep in mind, that we're actively developing the EDI functionality for EU & Colombia in an iterative approach against develop - so more to come, soon.

@blaggacao blaggacao marked this pull request as ready for review September 30, 2024 16:36
@barredterra
Copy link
Collaborator Author

barredterra commented Sep 30, 2024

If we have to use this very technical name, we should at least offer the import from the list view (or from an unsaved form) instead of from the saved form view. It would be kind of weird if the user first needs to manually extract some values from the XML, in order to then import the rest of it.

My approach was originally targeted at, e.g., being able to tell the users to create a Code List called "UNTDID.5305" and make sure it contains the tax codes they care about. The alternative, equally good or even better, would be being able to tell them to just import this code list. Telling them to extract the CanonicalVersionUri does not seem realistic, however.

@blaggacao
Copy link
Collaborator

blaggacao commented Oct 1, 2024

[ ... ] we should at least offer the import from the list view (or from an unsaved form) instead of from the saved form view. It would be kind of weird if the user first needs to manually extract some values from the XML, in order to then import the rest of it.

Telling them to extract the CanonicalVersionUri does not seem realistic, however.

Indeed! — I think this concern is already gracefully implemented?

General Scenario: Add more codes from the same list (e.g. different filters) — form import

Specific Scenario: I accidentially selected the wrong code list for upload

image


General Scenario: Add a new code list and some codes — list import

Implemented on the list views of Code List and Common Code

Specific Scenario: Step Two (immediately after upload); Selection of Import filters

image
Note: no extraction of CanonicalVersionUri by the user; filterable are fields with cardinality <= 5

@blaggacao
Copy link
Collaborator

d540d03

image

@blaggacao
Copy link
Collaborator

blaggacao commented Oct 2, 2024

d68d808

Fix UX of Title-less common codes in link dropdowns:

image

@blaggacao
Copy link
Collaborator

blaggacao commented Oct 2, 2024

Conclusion: We need to incubate a core feature "LazyRecords", which lazily traverse link values on on access but eagerly build the traversal tree 3-4 levels deep for best autocompletion. Inside, we then can replace standard Link bindings with our custom bindings. Whereby barren key renders the stringer of the target class in jinja (typically: name or title(?)). 🤣

We don't have multiple imports yet, so we can save one click and improve dicoverability.
- Allow re-import of the same code list to update existing data
- Fill all Code List data on first creation, not just the name
- Move import of Common Codes to common_code.py
- Add type hints
- Fix naming settings
- Move static utils out of class
- Remove unused import
- Remove ignore_permission=True
@barredterra
Copy link
Collaborator Author

barredterra commented Oct 19, 2024

#43425 (comment)

I broke this again, sorry.

Is this really valid? What use is a code if it can have multiple meanings, even in the same list?

@blaggacao
Copy link
Collaborator

blaggacao commented Oct 19, 2024

Is this really valid?

Yes.

What use is a code if it can have multiple meanings, even in the same list?

Not sure if we productively can second-guess the spec — especially since I have a real codelist which exhibits this behaviour.

Remember that code is not guaranteed to be a row prim key (there isn't even a guarantee that a row prim key is present), so we can't fake it and make it a requirement in our implementation.

Copy link
Collaborator

@blaggacao blaggacao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the better codelist handling — that looks a lot cleaner.

We'd just need to mitigate the two regressions:

  • Insertion speed for Codes (removal of raw bulk insert) — I think this can be still achived after the clarifying refactorings by operating on a generator / list rather than on the individual record object
  • Unique naming — we should aim to fix the silent bug in the previous implementation, as well

I broke this again, sorry.

Maybe I overlooked. But I'm not even quite sure if you broke it in principle 😄 or just accidentially via removing the idx during import. 😄

common_code.code_list = code_list

common_code.from_genericode(column_map, xml_element)
common_code.save()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunately prohibitively slow 😢 — we should revert back to the bulk insert.

I don't quite see a use case to update a code list item: they don't change on the same version. If they change they change on a different version which is a different code list.

Copy link
Collaborator Author

@barredterra barredterra Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prohibitively slow? I could't even blink while this imported my test data. How big are your code lists and how many do you want to import?

To be a bit more precise, on my machine this has a throughput of >200 records per second. And it needs to be done only once.

Copy link
Collaborator

@blaggacao blaggacao Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prohibitively slow?

It was at least pretty annoying during local tests, so much so that I took care to make it faster (not by accident or not knowing better). Now you took the same care to make it slower (for other reasons not entirely evident to even though I tried to infer some), so I'll have to assume it's going to be "pretty annoying during local tests" again.

Maybe you want to first briefly explain these "other reasons"?

Copy link
Collaborator Author

@barredterra barredterra Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually avoid bulk_insert. Yes, it is fast, but the reason for that is that it skips all validations and hooks, making it error prone and hard to customize. I would rather wait a couple seconds instead of skipping all that good stuff. Sure, when a task would take many minutes or hours, we cannot avoid bulk_insert. But for saving a couple of seconds, I think it's not worth the (potential) trouble.

Copy link
Collaborator Author

@barredterra barredterra Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now added a progress bar, so we don't have to stare at a blank screen during longer imports.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks. That makes sense, in principle.

Do we realistically expect the need for validations and hooks on mass import of Common Codes in practice, though? — I have a bit of a hard time to come to terms with the ask and presumption to deprioritize speed on abstract principle, alone.

The scenario here simply seems a bit far fetched to me.

The progress bar is a good idea to mitigate the problem. At least people wouldn't try to reload the page because it goes unresponsive.

It's been a while and I can't remember exactly how long it took. Granted, "prohibitively", indeed says as much about my personal tolerance level for slowness. 😊 Let me check it again report back, and when it takes more than 5 seconds, we revert back?

erpnext/edi/doctype/common_code/common_code.py Outdated Show resolved Hide resolved
Return a tuple of results instead
@barredterra barredterra removed their assignment Oct 23, 2024
@barredterra
Copy link
Collaborator Author

For testing, here's a list of code lists required for eInvoicing in Germany (XRechnung):

Comment on lines +38 to +39
def validate_distinct_references(self):
"""Ensure no two Common Codes of the same Code List are linked to the same document."""
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For generating messages, it is very useful to have exactly one common code per ERPNext entity. However, for parsing messages, it would be useful to map multiple common codes to the same ERPNext entity. This is currently not possible due to this validation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants