[Bug]: Flaky test_cross_version_persist.py test #2947

Open
tazarov opened this issue Oct 13, 2024 · 0 comments
Labels
bug (Something isn't working), by-chroma

Comments

Contributor

tazarov commented Oct 13, 2024

What happened?

Python tests / test (3.8, depot-ubuntu-22.04, chromadb/test/property/test_cross_version_persist.py)

https://github.com/chroma-core/chroma/actions/runs/11315485759/job/31466731225?pr=2946#logs - Failed
https://github.com/chroma-core/chroma/actions/runs/11315485759/job/31468326017?pr=2946#logs - Success

Versions

main

Relevant log output

def log_size_below_max(
        system: System, collections: List[Collection], has_collection_mutated: bool
    ) -> None:
        sqlite = system.instance(SqliteDB)
    
        if has_collection_mutated:
            # Must always keep one entry to avoid reusing seq_ids
            assert _total_embedding_queue_log_size(sqlite) >= 1
    
            # We purge per-collection as the sync_threshold is a per-collection setting
            sync_threshold_sum = sum(
                collection.metadata.get("hnsw:sync_threshold", 1000)
                if collection.metadata is not None
                else 1000
                for collection in collections
            )
            batch_size_sum = sum(
                collection.metadata.get("hnsw:batch_size", 100)
                if collection.metadata is not None
                else 100
                for collection in collections
            )
    
            # -1 is used because the queue is always at least 1 entry long, so deletion stops before the max ack'ed sequence ID.
            # And if the batch_size != sync_threshold, the queue can have up to batch_size more entries.
>           assert (
                _total_embedding_queue_log_size(sqlite) - 1
                <= sync_threshold_sum + batch_size_sum
            )
E           AssertionError
E           Falsifying example: test_cycle_versions(
E               version_settings=('0.5.13',
E                Settings(environment='', chroma_api_impl='chromadb.api.segment.SegmentAPI', chroma_server_nofile=None, chroma_server_thread_pool_size=40, tenant_id='default', topic_namespace='default', chroma_server_host=None, chroma_server_headers=None, chroma_server_http_port=None, chroma_server_ssl_enabled=False, chroma_server_ssl_verify=None, chroma_server_api_default_path='/api/v1', chroma_server_cors_allow_origins=[], is_persistent=True, persist_directory='/tmp/persistence_test_chromadb', chroma_memory_limit_bytes=0, chroma_segment_cache_policy=None, allow_reset=True, chroma_auth_token_transport_header=None, chroma_client_auth_provider=None, chroma_client_auth_credentials=None, chroma_server_auth_ignore_paths={'/api/v1': ['GET'], '/api/v1/heartbeat': ['GET'], '/api/v1/version': ['GET']}, chroma_overwrite_singleton_tenant_database_access_from_auth=False, chroma_server_authn_provider=None, chroma_server_authn_credentials=None, chroma_server_authn_credentials_file=None, chroma_server_authz_provider=None, chroma_server_authz_config=None, chroma_server_authz_config_file=None, chroma_product_telemetry_impl='chromadb.telemetry.product.posthog.Posthog', chroma_telemetry_impl='chromadb.telemetry.product.posthog.Posthog', anonymized_telemetry=True, chroma_otel_collection_endpoint='', chroma_otel_service_name='chromadb', chroma_otel_collection_headers={}, chroma_otel_granularity=None, migrations='apply', migrations_hash_algorithm='md5', chroma_segment_directory_impl='chromadb.segment.impl.distributed.segment_directory.RendezvousHashSegmentDirectory', chroma_memberlist_provider_impl='chromadb.segment.impl.distributed.segment_directory.CustomResourceMemberlistProvider', worker_memberlist_name='query-service-memberlist', chroma_server_grpc_port=None, chroma_sysdb_impl='chromadb.db.impl.sqlite.SqliteDB', chroma_producer_impl='chromadb.db.impl.sqlite.SqliteDB', chroma_consumer_impl='chromadb.db.impl.sqlite.SqliteDB', 
chroma_segment_manager_impl='chromadb.segment.impl.manager.local.LocalSegmentManager', chroma_quota_provider_impl=None, chroma_rate_limiting_provider_impl=None, chroma_rate_limit_enforcer_impl='chromadb.rate_limit.simple_rate_limit.SimpleRateLimitEnforcer', chroma_logservice_request_timeout_seconds=3, chroma_sysdb_request_timeout_seconds=3, chroma_query_request_timeout_seconds=60, chroma_db_impl=None, chroma_collection_assignment_policy_impl='chromadb.ingest.impl.simple_policy.SimpleAssignmentPolicy', chroma_coordinator_host='localhost', chroma_logservice_host='localhost', chroma_logservice_port=50052)),
E               collection_strategy=Collection(name='Ugn', metadata={'hnsw:construction_ef': 128, 'hnsw:search_ef': 128, 'hnsw:M': 128, 'hnsw:sync_threshold': 10, 'hnsw:batch_size': 8}, embedding_function=hashing_embedding_function(dim=1487, dtype=<class 'numpy.float32'>), id=UUID('1496cc63-f70c-438a-8ed2-c83376eab4e3'), dimension=1487, dtype=<class 'numpy.float32'>, known_metadata_keys={}, known_document_keywords=[], has_documents=False, has_embeddings=True),
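
For reference, the bound checked by the failing assertion can be worked out by hand for the falsifying example. The sketch below is not Chroma code: `max_allowed_queue_size` is a hypothetical helper that takes plain metadata dicts rather than `Collection` objects and only mirrors the arithmetic in `log_size_below_max` (defaults of 1000 and 100, summed over all collections). It assumes the single collection shown in the log is the only one in play. The actual queue size at the time of failure is not shown in the log.

```python
from typing import Any, Dict, List, Optional

# Defaults copied from the invariant shown in the log output above.
DEFAULT_SYNC_THRESHOLD = 1000
DEFAULT_BATCH_SIZE = 100


def max_allowed_queue_size(metadatas: List[Optional[Dict[str, Any]]]) -> int:
    """Hypothetical helper: largest queue size that still satisfies
    `queue_size - 1 <= sync_threshold_sum + batch_size_sum`."""
    sync_threshold_sum = sum(
        m.get("hnsw:sync_threshold", DEFAULT_SYNC_THRESHOLD)
        if m is not None
        else DEFAULT_SYNC_THRESHOLD
        for m in metadatas
    )
    batch_size_sum = sum(
        m.get("hnsw:batch_size", DEFAULT_BATCH_SIZE)
        if m is not None
        else DEFAULT_BATCH_SIZE
        for m in metadatas
    )
    # The invariant subtracts 1 from the queue size, so the limit is the sum plus 1.
    return sync_threshold_sum + batch_size_sum + 1


# Metadata of the collection from the falsifying example.
falsifying_metadata = {
    "hnsw:construction_ef": 128,
    "hnsw:search_ef": 128,
    "hnsw:M": 128,
    "hnsw:sync_threshold": 10,
    "hnsw:batch_size": 8,
}

limit = max_allowed_queue_size([falsifying_metadata])
print(limit)
```

Under these assumptions the limit is 10 + 8 + 1 = 19, so the assertion can only fail if the embedding queue held at least 20 entries when the invariant ran.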
tazarov added the bug label Oct 13, 2024