ResourceNotificationService.WaitForResourceAsync is unreliable for Postgres resource #5469

Closed
1 task done
impworks opened this issue Aug 28, 2024 · 5 comments · Fixed by #5867
Labels
area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication

Comments

@impworks

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

Calling await ResourceNotificationService.WaitForResourceAsync("postgres-db", KnownResourceStates.Running) returns too early and results in cryptic exceptions when trying to actually connect to the database.

Expected Behavior

The database should be fully accessible when this method has returned.

Steps To Reproduce

This occurs when I am writing an integration test and need to do a database migration. Consider the following code:

AppHost/Program.cs

var builder = DistributedApplication.CreateBuilder(args);
var pgsql = builder.AddPostgres("postgres");
var db = pgsql.AddDatabase("mydb");
var api = builder.AddProject<Projects.MyApi>("myapi").WithReference(db);

MyTests.cs

public class MyTests
{
    [Fact]
    public async Task MyApiTest()
    {
        var appHost = await DistributedApplicationTestingBuilder.CreateAsync<Projects.AspireHost>();
        appHost.Services.ConfigureHttpClientDefaults(clientBuilder => { clientBuilder.AddStandardResilienceHandler(); });
        var app = await appHost.BuildAsync();
        await app.StartAsync();

        // waiting for all resources
        var rns = app.Services.GetRequiredService<ResourceNotificationService>();
        await rns.WaitForResourceAsync("mydb", KnownResourceStates.Running).WaitAsync(TimeSpan.FromSeconds(30));
        await rns.WaitForResourceAsync("myapi", KnownResourceStates.Running).WaitAsync(TimeSpan.FromSeconds(30));

        // make sure the database schema is up to date
        var cstr = await app.GetConnectionStringAsync("mydb");
        var opts = new DbContextOptionsBuilder<MyDbContext>().UseNpgsql(cstr).UseSnakeCaseNamingConvention().Options;
        using(var ctx = new MyDbContext(opts))
            await ctx.Database.MigrateAsync();      // <<< the exception happens here

        // test the API
        var httpClient = app.CreateHttpClient("myapi");
        // ...
    }
}

MyDbContext is an EF Core DbContext.

An exception pops up when calling ctx.Database.MigrateAsync() (see below).

Key points are:

  • The test succeeds only intermittently, roughly 1 or 2 runs out of 10
  • The test always succeeds if I add a Task.Delay(1000) before creating the context, or pause on a breakpoint
  • The issue does not reproduce outside the test, e.g. when the host is run via F5 in VS and the API is set to migrate the database on startup
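
Since a fixed delay makes the test pass, one possible stopgap is to retry a cheap connection probe until it succeeds instead of sleeping for a hard-coded second. The following is only a minimal sketch under that assumption: Readiness and WaitUntilReadyAsync are hypothetical names, not part of Aspire or Npgsql, and the helper uses nothing beyond the base class library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Readiness
{
    // Hypothetical helper (not an Aspire API): invokes `probe` repeatedly until it
    // completes without throwing, and returns the number of attempts it took.
    // Once `timeout` elapses, an OperationCanceledException propagates to the caller.
    public static async Task<int> WaitUntilReadyAsync(
        Func<CancellationToken, Task> probe,
        TimeSpan timeout,
        TimeSpan retryDelay)
    {
        using var cts = new CancellationTokenSource(timeout);
        var attempts = 0;
        while (true)
        {
            attempts++;
            try
            {
                await probe(cts.Token);
                return attempts; // the probe succeeded, the resource is usable
            }
            catch (Exception ex) when (ex is not OperationCanceledException)
            {
                // Not ready yet: pause, then try again.
                // Task.Delay throws once the timeout token is cancelled.
                await Task.Delay(retryDelay, cts.Token);
            }
        }
    }
}

// Possible use in the test above, before creating the DbContext
// (assumes the Npgsql package is referenced):
//
//   await Readiness.WaitUntilReadyAsync(
//       async ct =>
//       {
//           await using var conn = new NpgsqlConnection(cstr);
//           await conn.OpenAsync(ct);
//       },
//       TimeSpan.FromSeconds(30),
//       TimeSpan.FromMilliseconds(250));
```

This does not change what WaitForResourceAsync reports; it only replaces the arbitrary delay with a bounded wait on the condition the test actually depends on, namely that the server accepts a connection.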

Exceptions (if any)

Npgsql.NpgsqlException
Exception while reading from stream
   at Npgsql.Internal.NpgsqlReadBuffer.<Ensure>g__EnsureLong|55_0(NpgsqlReadBuffer buffer, Int32 count, Boolean async, Boolean readingNotifications)
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource.GetResult(Int16 token)
   at Npgsql.Internal.NpgsqlConnector.RawOpen(SslMode sslMode, NpgsqlTimeout timeout, Boolean async, CancellationToken cancellationToken, Boolean isFirstAttempt)
   at Npgsql.Internal.NpgsqlConnector.<Open>g__OpenCore|213_1(NpgsqlConnector conn, SslMode sslMode, NpgsqlTimeout timeout, Boolean async, CancellationToken cancellationToken, Boolean isFirstAttempt)
   at Npgsql.Internal.NpgsqlConnector.Open(NpgsqlTimeout timeout, Boolean async, CancellationToken cancellationToken)
   at Npgsql.UnpooledDataSource.Get(NpgsqlConnection conn, NpgsqlTimeout timeout, Boolean async, CancellationToken cancellationToken)
   at Npgsql.NpgsqlConnection.<Open>g__OpenAsync|42_0(Boolean async, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternalAsync(Boolean errorsExpected, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternalAsync(Boolean errorsExpected, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenAsync(CancellationToken cancellationToken, Boolean errorsExpected)
   at Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.NpgsqlDatabaseCreator.Exists(Boolean async, CancellationToken cancellationToken)
   at Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.NpgsqlDatabaseCreator.Exists(Boolean async, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Migrations.HistoryRepository.ExistsAsync(CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Migrations.HistoryRepository.GetAppliedMigrationsAsync(CancellationToken cancellationToken)
   at Npgsql.EntityFrameworkCore.PostgreSQL.Migrations.Internal.NpgsqlMigrator.MigrateAsync(String targetMigration, CancellationToken cancellationToken)

System.IO.EndOfStreamException
Attempted to read past the end of the stream.
   at Npgsql.Internal.NpgsqlReadBuffer.<Ensure>g__EnsureLong|55_0(NpgsqlReadBuffer buffer, Int32 count, Boolean async, Boolean readingNotifications)

.NET Version info

.NET SDK:
 Version:           8.0.303
 Commit:            29ab8e3268
 Workload version:  8.0.300-manifests.56cd0383
 MSBuild version:   17.10.4+10fbfbf2e

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.19045
 OS Platform: Windows
 RID:         win-x64
 Base Path:   C:\Program Files\dotnet\sdk\8.0.303\

.NET workloads installed:
 [aspire]
   Installation Source: SDK 8.0.300, VS 17.10.35122.118
   Manifest Version:    8.1.0/8.0.100
   Manifest Path:       C:\Program Files\dotnet\sdk-manifests\8.0.100\microsoft.net.sdk.aspire\8.1.0\WorkloadManifest.json
   Install Type:        FileBased


Host:
  Version:      8.0.7
  Architecture: x64
  Commit:       2aade6beb0

.NET SDKs installed:
  5.0.408 [C:\Program Files\dotnet\sdk]
  6.0.425 [C:\Program Files\dotnet\sdk]
  8.0.303 [C:\Program Files\dotnet\sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 6.0.32 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 6.0.33 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 7.0.20 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 8.0.7 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 6.0.32 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 6.0.33 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 7.0.20 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 8.0.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.WindowsDesktop.App 5.0.17 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 6.0.32 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 6.0.33 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 7.0.20 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
  Microsoft.WindowsDesktop.App 8.0.7 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

Other architectures found:
  x86   [C:\Program Files (x86)\dotnet]
    registered at [HKLM\SOFTWARE\dotnet\Setup\InstalledVersions\x86\InstallLocation]

Environment variables:
  Not set

global.json file:
  Not found

Anything else?

Output for dotnet workload list:

Installed Workload Id      Manifest Version      Installation Source
--------------------------------------------------------------------------------
aspire                     8.1.0/8.0.100         SDK 8.0.300, VS 17.10.35122.118

The packages in the solution are v8.1.0.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 28, 2024
@davidfowl
Member

It's not unreliable, but there's no state that distinguishes a resource that is healthy from one that is merely running. This is something we're going to address in Aspire 9 with health checks.

@davidfowl
Member

cc @mitchdenny

@davidfowl
Member

See #5275

@davidfowl davidfowl reopened this Sep 7, 2024
@davidfowl
Member

@mitchdenny We may need a helper method on top of RNS for WaitForHealthy

@davidfowl davidfowl added area-app-model Issues pertaining to the APIs in Aspire.Hosting, e.g. DistributedApplication enhancement and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Sep 7, 2024
@mitchdenny
Member

> @mitchdenny We may need a helper method on top of RNS for WaitForHealthy

Yep. It'll come in handy in a few places.

@mitchdenny mitchdenny added this to the 9.0 milestone Sep 8, 2024
@mitchdenny mitchdenny removed their assignment Sep 9, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Oct 24, 2024