Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎯 Python Experience Improvements #1919

Open
2 of 5 tasks
RomainMuller opened this issue Aug 19, 2020 · 29 comments
Open
2 of 5 tasks

🎯 Python Experience Improvements #1919

RomainMuller opened this issue Aug 19, 2020 · 29 comments
Labels
effort/large Large work item – several weeks of effort feature-request A feature should be added or improved. language/python Related to Python bindings management/tracking Issues that track a subject or multiple issues p1

Comments

@RomainMuller
Copy link
Contributor

RomainMuller commented Aug 19, 2020

Preamble

This is a tracking issue summarizing the existing pain points in Python front-ends generated by jsii-pacmak. Members of the community are more than welcome to contribute to this, for example by sharing their pet peeves in the communication thread below.

The initial post will be periodically updated to include a summary of the items that are accepted as needed improvements to our Python offering.

Known Pain Points

  • Instructions in Contributing.md for pulling the superchain docker image don't work aws-cdk#474 - Incomplete CDK documentation due to quirks in sphinx
    • Nested types are missing
    • Many parameters are showing typed as Forwardref
  • Incorrectly transliterated sample code in the cdk documentation
    • Type information is necessary in many cases, and jsii-rosetta does not require that examples compile
    • In many cases, it's only a matter of adding a rosetta fixture
  • Un-pythonic reliance on a custom metaclass
    • Metaclasses complicate extension: all sub-classes in a class family must eventually derive from the same metaclass
    • Python doesn't really have interfaces, the idiom is using Protocols
      • Protocols can be implicitly implemented (structural typing)
      • Explicit implementation uses the extension syntax, but this cannot work due to metaclass dissonance
  • Passing dicts in lieu of jsii structs does not consistently work
    • Case adjustments of values is not always consistently done, resulting in runtime errors or worse (silently ignored properties, incorrect/unexpected side-effects, ...)
  • Type-checking errors from @jsii/kernel are obscure

Discussion

Incomplete CDK Documentation

The AWS CDK API reference documentation is currently generated using a private tool, which leverages the idiomatic documentation generator for each supported language (docfx, javadoc, sphinx, ...). While this has the advantage of giving users the same documentation experience that they're used to for their language of choice, it has several disadvantages:

  • it is an awkward experience for users building polyglot applications, since each language's documentation set is entirely separate from the others
  • we have to suffer the limitations of the idiomatic tools
    • customizing templates is often a difficult process
    • certain jsii features are broken or weird in some languages (i.e: sphinx and nested types)

Additionally, the current generator being private means other users of jsii are not able to produce a documentation site that looks and feels like the AWS CDK reference. One of the goals of jsii being that developers need not learn all about the standard tools of front-end languages, providing a jsiidoc tool would greatly reduce friction around documenting polyglot APIs. This is however a significant project, and shorter-term solutions (such as using autodoc for Python with generated .rst files).

Incorrectly Transliterated Sample Code

The jsii-rosetta tool is able to collect code example snippets from various locations in the jsii assemblies, and attempts to automatically transliterate those to other front-end languages. It delivers the best fidelity when it is able to obtain complete type information around the example (i.e: it is valid and compiling TypeScript code); and must resort to broad assumptions otherwise.

In the context of AWS CDK, many example snippets lack context information necessary in order to obtain complete type information, which often leads to incorrectly transliterated code. In the Python context, this can mean the transliterated code uses keyword arguments at a location where they are not actually supported.

There is a --fail option to jsii-rosetta which will cause it to abort if an example is not compiling (the corollary of this being there cannot be sufficient type information). Enabling this behavior has two benefits:

  • It guarantees example code is up-to-date with the current library's API
  • Complete type information should be available, enabling the best possible transliteration result

This feature is however not enabled by default on the AWS CDK, as many examples there do not have the full context needed for them to compile. In many cases, all that is needed is the introduction of fixtures that provide the necessary context while not weighting example code shown in the documentations with repetitive boilerplate.

Un-pythonic reliance on a custom metaclass

The jsii runtime for Python provides a JSIIMeta metaclass that is used by all generated bindings for classes (classes from the jsii assembly). As a consequence, attempting to leverage Python's multiple inheritance (for example, to implement an interface explicitly) may lead to a runtime error, as all bases of a Python class must be rooted on the same metaclass.

Additionally, the JSIIMeta metaclass is merely used to record jsii-specific metadata on generated classes; but since those are code-generated, it would be trivial to register the exact same metadata as part of the generated declarations, rendering the metaclass useless.

Not relying on the metaclass would mean users are able to implement interfaces explicitly in the idiomatic way, which would provide better information to static analyzers such as mypy, to help ensure developers have correctly implemented an interface, and are correctly passing values that conform to interfaces to method parameters.

# The slightly awkward:
@jsii.implements(lib.IInterface)
class Implementor():
    ...

# Could become:
class Implementor(lib.IInterface):
    ...

Passing dict in lieu of jsii structs does not consistently work

Similar to the Javascript idioms, Python developers will often intuitively use dict literals in places where the API expects some instance of a jsii struct. The generated Python code has specific code paths to convert a dict to the correct struct instance, however there seem to be many cases where this conversion is not happening, leading to:

  • type-checking errors in the @jsii/kernel module
  • property keys not being translated from snake_case to camelCase
  • properties being dropped entirely by the @jsii/kernel

The workaround forces users to resort to relatively verbose boilerplate code, which in turns looks very un-pythonic.

# The somewhat heavy-handed
s3.Bucket(self, 'Gold', lifecycle_rules=[s3.LifecycleRule(expiration=cdk.Duration.days(15)])

# Could become:
s3.Bucket(self, 'Gold', lifecycle_rules=[{ 'expiration': cdk.Duration.days(15) }])

Type-checking errors from @jsii/kernel are obscure

When a user makes a programming mistake and passes the wrong kind of value to an API (be it a parameter of some method or constructor, or the value being set to some property), the errors generated by the @jsii/kernel are not very helpful to developers as:

  • They no not relate to the user's native language type names
  • They may lack sufficient context to be parseable by the user (for example, when the error occurs in a deeply-nested value)
@RomainMuller RomainMuller added language/python Related to Python bindings feature-request A feature should be added or improved. effort/large Large work item – several weeks of effort p1 management/tracking Issues that track a subject or multiple issues labels Aug 19, 2020
@RomainMuller RomainMuller self-assigned this Aug 19, 2020
@RomainMuller RomainMuller pinned this issue Aug 19, 2020
@rix0rrr
Copy link
Contributor

rix0rrr commented Aug 19, 2020

I know you're working on it already, but might want to put some form of "runtime type validation" in there as well. I saw a LOT of issues where users confused L1 and L2 classes.

@rix0rrr
Copy link
Contributor

rix0rrr commented Aug 19, 2020

Incomplete CDK Documentation

What you're writing about a better tool is correct, but is a big project. There's probably some effective short-term things we can do instead (for example: stop relying on autodoc for Python and just starting to generate RST docs ourselves).

Un-pythonic reliance on a custom metaclass

I'm missing some concrete examples here. Are you proposing to replace:

@jsii.implements(cdk.ISomething)
class Boo:
   ...

# with
class Boo(cdk.ISomething):
   ...

?

@RomainMuller
Copy link
Contributor Author

I'm missing some concrete examples here. Are you proposing to replace:

@jsii.implements(cdk.ISomething)
class Boo:
   ...

# with
class Boo(cdk.ISomething):
   ...

Yes, exactly that.

@richardhboyd
Copy link

Did you recently change the method signatures to include line-breaks? That was my biggest pet peeve but it seems to be fixed now.

@pepastach
Copy link

I'd love to have a way to unit test my stacks. The assert module only works with TypeScript AFAIK. We now have to use snapshot tests. Ie. generate cloudformation template and compare it with a known and validated one. Then if the test fails, we get two huge strings that don't match. It works, but it's not atomic.

@brainstorm
Copy link

Also fix the pip-installation: aws/aws-cdk#3406 (comment)

@richardhboyd
Copy link

@brainstorm can you elaborate a bit more on "fix"? Are you referring to one large package that includes all CDK modules instead of individual modules per service?

@brainstorm
Copy link

brainstorm commented Aug 20, 2020

Not instead, but in addition. pip install aws-cdk should pull all the aws-cdk-* submodules avoiding ImportErrors. You can as well just install the just modules you need by being explicit about the deps (i.e only S3 or Lambda CDK submodules).

Now, I know this could be seen as "bloat", but from a DX perspective, I reckon it's the better decision IMO?

@richardhboyd
Copy link

We're working on something similar with monocdk

@adamjkeller
Copy link

adamjkeller commented Aug 20, 2020

Adding a section to the documentation on unit testing with examples. It would be super slick to have this integrated with the cli, ie. cdk run-tests

@kristianpaul
Copy link

awkward experience when looking at docs indeed

@RomainMuller
Copy link
Contributor Author

Did you recently change the method signatures to include line-breaks? That was my biggest pet peeve but it seems to be fixed now.

We used to run generated Python code through black for a while. Reverted because it was too slow, but we're generating friendlier code now (lines ideally < 90 chars, but sometimes longer because we couldn't roll out a full fledged Python parser here).

@RomainMuller
Copy link
Contributor Author

Adding a section to the documentation on unit testing with examples. It would be super slick to have this integrated with the cli, ie. cdk run-tests

This is interesting but could be tricky (because we'd likely end up having to be opinionated with respects to test frameworks, and opinions usually don't get you a lot of new friends 🤣).

We've considered this as "living with the languages' idioms", so we'd expect you want to do it "like you normally test". That seems to go against the idea of integrating this in the cdk CLI, but well... if this is what the majority wants....

mergify bot pushed a commit that referenced this issue Sep 25, 2020
Improved the documentability of generated Python code by reducing the
use of forward references through keeping track of types that were
already emitted; and by emitting the kind of forward reference that `sphinx`'
`autodoc` extension is able to resolve (forward references within a sub-package
must be relative to that sub-package; and nested declarations can not
use their surrounding type's name in forward references (those must be relative
to the surrounding type's context).

Additionally, removed duplicated headings that were generated for
documentation segments (`see`, `returns`, ...), as those were causing
redundant content to appear on the generated docsite.

This completely removes `ForwardRef` instances from the Python
reference documentation (as was checked by issuing a recursive
`grep` on the generated documentation).

Fixes #474
Related #1919 

---

By submitting this pull request, I confirm that my contribution is made under the terms of the [Apache 2.0 license].

[Apache 2.0 license]: https://www.apache.org/licenses/LICENSE-2.0
@jarrodldavis
Copy link

I'd like to add some notes about my experience using CDK with type checking (specifically Pyright/Pylance in Visual Studio Code):

  • class properties (such as Runtime.PYTHON_3_8 from aws_cdk.aws_lambda) are not seen as proper class properties by type checkers. Instead, they are seen as normal functions, requiring a manual typing.cast() to use these values where needed
  • implementations of interfaces (such as IHttpRouteIntegration implemented by LambdaProxyIntegration in aws_cdk.aws_apigatewayv2_integrations) are not seen as true implementations, since the only indication of this relationship is through a custom decorator that the type checker doesn't know about. As a result, this also requires using typing.cast to use these values where needed

@RomainMuller
Copy link
Contributor Author

The interfaces should be declared as Protocols, which means that normally, if they are correctly implemented, structural typing should make them match... Although I'd say I have seen cases where mypy would not seem to be impressed. I'm not too sure how to address, especially as metaclasses are otherwise involved (and they complicate everything in that flow)

@gshpychka
Copy link

pyright is compaining about virtually all interfaces for me, is there a path to fixing this?

@dmartin-gh
Copy link

I am also experiencing the interface issues described above. We're looking to adopt the CDK for our main platform and as a mostly Python/C++ shop the Python CDK seemed like a great choice for us. I didn't make it very far into my first sample app to test things out before mypy started throwing errors.

from aws_cdk.core import Construct, Stack
from aws_cdk import aws_ec2 as ec2

class MyEC2Stack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        ec2.Instance(self, 'my-app',
            instance_type=ec2.InstanceType('m5.4xlarge'),
            machine_image=ec2.MachineImage.generic_linux({
                'us-east-1': 'ami-0affd4508a5d2481b', # CentOS 7 (x86_64) - with Updates HVM
            }),
            vpc=ec2.Vpc(self, "my-app-vpc"))

image

I would love to see this fixed so we can start building out our application stack with confidence that we're passing the right level of constructs around!

@polothy
Copy link
Contributor

polothy commented Jun 18, 2021

class properties (such as Runtime.PYTHON_3_8 from aws_cdk.aws_lambda) are not seen as proper class properties by type checkers. Instead, they are seen as normal functions, requiring a manual typing.cast() to use these values where needed

Seeing this as well in multiple IDEs. I swear I didn't see it before, but maybe it was just ignored by my IDE before. Code certainly works, just need the right @thing perhaps. I couldn't figure out which one.

@osulli
Copy link

osulli commented Sep 8, 2021

EDIT: Sorry I think this might actually be the wrong Issue I had bookmarked but I cannot for the life of me find the JSII / CDK issue discussing lack of intellisense and docstrings for CDK packages.

--

I wanted to offer a cool feature I found in PyCharm, period, but also could be used to generate something better.

In PyCharm:
SETTINGS -> TOOLS -> EXTERNAL DOCUMENTATION

Add a new source as below:

# Module Name
aws_cdk
# URL/Path Pattern
https://docs.aws.amazon.com/cdk/api/latest/python/{module.name}/{class.name}.html#{module.name}.{class.name}

image

Outcome

  • It leaves much to be desired however a built-in URL to docs is an interesting first step.
  • The docs can also be opened with SHIFT+F1 on Windows.

Speculation

  • Could this potentially lead to pointing to a local file which CAN be read, notably by WSL2 Python interpreter + PyCharm IDE?
  • Could a plugin be written to GET https://docs.aws.amazon.com/cdk/api/latest/python/aws_cdk.aws_ec2/ClientVpnEndpoint.html#aws_cdk.aws_ec2.ClientVpnEndpoint and output?

image

@CarlosDomingues
Copy link

I'm also having issues using Pyright and interfaces, same as @jarrodldavis described.

@RomainMuller RomainMuller unpinned this issue Jan 6, 2022
@rirze
Copy link

rirze commented Apr 15, 2022

Another quirk of the un-Pythonic metaclass requirement: invalid attributes (of a class that uses JSIIMeta as a metaclass) don't throw AttributeErrors. Instead they return quietly as None...

I suspect there's some clever way of overloading __getattr__ that is necessary for jsii vars, but doesn't handle the case of non-existent Python variables as expected.

@Jacco
Copy link

Jacco commented May 8, 2022

About passing dicts:

I am playing with an adjustment of https://github.com/aws/jsii/blob/main/packages/jsii-pacmak/lib/targets/python.ts

In case where name_mappings are present instead of only property getter also setter could be emitted.

These setters could then cast in case a dict is found and the init would use those setters.

This is the cast function:

def cast(klass, value):
    if value.__class__.__name__ == 'dict':
        return klass(**value)
    else:
        return value

For example CfnBucketProps from cdk:

In init:

        if logging_configuration is not None:
            self._values["logging_configuration"] = logging_configuration

becomes:

        if logging_configuration is not None:
            self.logging_configuration = logging_configuration

And a setter is added:

    @logging_configuration.setter
    def logging_configuration(
        self,
        value: typing.Optional[typing.Union["CfnBucket.LoggingConfigurationProperty", _IResolvable_da3f097b]],
    ) -> None:
        self._values["logging_configuration"] = cast(CfnBucket.LoggingConfigurationProperty, value)

This would allow those dicts :-) What do you think about this?

@RomainMuller
Copy link
Contributor Author

@Jacco one property of jsii structs is that they are pure-data, immutable objects. Adding setters to those would break the immutable property.

So basically, we'd need those setters to be private, if we want to go down that path... Otherwise, we'll have to carefully consider the semantic implications of allowing mutations here in violation of the jsii type system (it might be fine, but I wouldn't want to take that decision lightly)

@RomainMuller
Copy link
Contributor Author

@rirze

Another quirk of the un-Pythonic metaclass requirement: invalid attributes (of a class that uses JSIIMeta as a metaclass) don't throw AttributeErrors. Instead they return quietly as None...

Do you have an example of code that has this behavior? I'm a bit surprised here, as the metaclass doesn't directly override __getatt__, and our dynamic proxy objects (appear to) appropriately raise AttributeError when attempting to use a non-existing property...

There might be an edge-case I'm not seeing and I'd like to step through some code that behaves incorrectly to get a sense of what's going on...

@RomainMuller
Copy link
Contributor Author

@dmartin-gh

I am also experiencing the interface issues described above. We're looking to adopt the CDK for our main platform and as a mostly Python/C++ shop the Python CDK seemed like a great choice for us. I didn't make it very far into my first sample app to test things out before mypy started throwing errors.

from aws_cdk.core import Construct, Stack
from aws_cdk import aws_ec2 as ec2

class MyEC2Stack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        ec2.Instance(self, 'my-app',
            instance_type=ec2.InstanceType('m5.4xlarge'),
            machine_image=ec2.MachineImage.generic_linux({
                'us-east-1': 'ami-0affd4508a5d2481b', # CentOS 7 (x86_64) - with Updates HVM
            }),
            vpc=ec2.Vpc(self, "my-app-vpc"))

image

I would love to see this fixed so we can start building out our application stack with confidence that we're passing the right level of constructs around!

Is this still a problem today? I have been trying to reproduce this (with CDK v2, though), and am not getting the MyPy error you report here... if you still happen to have the error, maybe this is because of your specific MyMy version or configuration (in which case, I would like to know what these are so I can reproduce).

@gshpychka
Copy link

gshpychka commented Jun 29, 2022

@RomainMuller

Is this still a problem today?

It's much less of a problem after #2809, but it still exists due to #2927 / #4541

@spyoungtech
Copy link

spyoungtech commented Jan 4, 2023

It would also be nice if interfaces/protocols could be runtime checkable.

@jsii.implements(aws_cdk.IAspect)
class ConfigureConnections:
    def visit(self, node):
        # Raises error 
        # only protocols decorated with `@runtime_checkable` can be used for instance/subclass checks
        if not isinstance(node, ec2.IConnectable):
            return
        # ...
Aspects.of(scope).add(ConfigureConnections(my_resources))

Is there a workaround for this shortcoming other than manually implementing protocol checking?

@LS80
Copy link

LS80 commented Oct 12, 2023

Passing dict in lieu of jsii structs does not consistently work

Similar to the Javascript idioms, Python developers will often intuitively use dict literals in places where the API expects some instance of a jsii struct.

For me this is the one thing that stops cdk8s with Python being widely usable. Is there any hope that it will be fixed?

@mariogalic
Copy link

@dmartin-gh

I am also experiencing the interface issues described above. We're looking to adopt the CDK for our main platform and as a mostly Python/C++ shop the Python CDK seemed like a great choice for us. I didn't make it very far into my first sample app to test things out before mypy started throwing errors.

from aws_cdk.core import Construct, Stack
from aws_cdk import aws_ec2 as ec2

class MyEC2Stack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        ec2.Instance(self, 'my-app',
            instance_type=ec2.InstanceType('m5.4xlarge'),
            machine_image=ec2.MachineImage.generic_linux({
                'us-east-1': 'ami-0affd4508a5d2481b', # CentOS 7 (x86_64) - with Updates HVM
            }),
            vpc=ec2.Vpc(self, "my-app-vpc"))

image
I would love to see this fixed so we can start building out our application stack with confidence that we're passing the right level of constructs around!

Is this still a problem today? I have been trying to reproduce this (with CDK v2, though), and am not getting the MyPy error you report here... if you still happen to have the error, maybe this is because of your specific MyMy version or configuration (in which case, I would like to know what these are so I can reproduce).

Indeed it is still a problem. For example try loading up apigw-http-api-lambda-dynamodb-python-cdk in vscode, then pyright reports for

        # Create the Lambda function to receive the request
        api_hanlder = lambda_.Function(
            ...
        )

        ...

        # Create API Gateway
        apigw_.LambdaRestApi(
           ...
            handler=api_hanlder,  # Pyright error: Argument of type "Function" cannot be assigned to parameter "handler" of type "IFunction"
        )

typecheck error

Argument of type "Function" cannot be assigned to parameter "handler" of type "IFunction" in function "__init__"
  "Function" is incompatible with protocol "IFunction"
    "IFunction" is incompatible with "Function"
    "IFunction" is incompatible with "Function"
    "IFunction" is incompatible with "Function"
    "IFunction" is incompatible with "Function"
    "IFunction" is incompatible with "Function"
    "IFunction" is incompatible with "Function"
    "IFunction" is incompatible with "Function"Pylance[reportArgumentType](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportArgumentType)
(variable) api_hanlder: Function

as shown in this screenshot

image

That project uses aws-cdk-lib==2.77.0 however the same issue exists after updating to recent aws-cdk-lib==2.143.1. On my machine it happens with the following configuration

pyright 1.1.364
pylance 2024.5.103
aws-cdk-lib 2.143.1
python 3.11.2
vscode 1.89.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
effort/large Large work item – several weeks of effort feature-request A feature should be added or improved. language/python Related to Python bindings management/tracking Issues that track a subject or multiple issues p1
Projects
None yet
Development

No branches or pull requests