Closed
Labels
⚙ Done (Bug fix, enhancement, FR that's completed pending release) · 🐞 Bug (Something isn't working) · 📌 Root caused (identified the root cause of bug)
Description
crawl4ai version
0.5.0.post8
Expected Behavior
If the rate limit is hit, the user should be informed.
Current Behavior
When the rate limit is still exceeded after all retries, perform_completion_with_backoff returns a list, which LLMExtractionStrategy.extract does not handle: it then tries to access the response's usage field, which does not exist on a list, so extracted_content ends up as:
```json
[
    {
        "index": 0,
        "error": true,
        "tags": [
            "error"
        ],
        "content": "'list' object has no attribute 'usage'"
    }
]
```

Is this reproducible?
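For illustration, a minimal sketch of the failure mode described above. The function names here are hypothetical stand-ins, not the actual crawl4ai internals: when the retry loop gives up, the payload is a plain list, and attribute access on it raises the AttributeError that surfaces as the string in extracted_content.

```python
def perform_completion_with_backoff_exhausted() -> list[dict]:
    # Stand-in for the helper after exhausting retries: it returns an
    # error list instead of a response object carrying a `usage` attribute.
    return [{"index": 0, "error": True, "tags": ["error"]}]


def extract_usage(response: object) -> str:
    # Stand-in for the unguarded access in LLMExtractionStrategy.extract.
    try:
        return str(response.usage)  # type: ignore[attr-defined]
    except AttributeError as exc:
        # This message is what ends up in extracted_content.
        return str(exc)


print(extract_usage(perform_completion_with_backoff_exhausted()))
# → 'list' object has no attribute 'usage'
```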
Yes
Inputs Causing the Bug
Any request that hits the rate limit for more than 2 retries.

Steps to Reproduce
Perform a crawl using LLMExtractionStrategy.

Code snippets
"""Test LLM extraction strategy for job postings."""
import json
import logging
import os
import sys
from typing import TYPE_CHECKING, Any
from crawl4ai import AsyncWebCrawler, BrowserConfig, CacheMode, CrawlerRunConfig
from crawl4ai.async_configs import LLMConfig
from crawl4ai.extraction_strategy import LLMExtractionStrategy
from pydantic import BaseModel, Field
import pytest
if TYPE_CHECKING:
from crawl4ai.models import CrawlResult
_LOGGER: logging.Logger = logging.getLogger(__name__)
class JobRequirement(BaseModel):
"""Schema for job requirements."""
category: str = Field(
description="Category of the requirement (e.g., Technical, Soft Skills)",
)
items: list[str] = Field(
description="List of specific requirements in this category",
)
priority: str = Field(
description="Priority level (Required/Preferred) based on the HTML class or context",
)
class JobPosting(BaseModel):
"""Schema for job postings."""
title: str = Field(description="Job title")
department: str = Field(description="Department or team")
location: str = Field(description="Job location, including remote options")
salary_range: str | None = Field(description="Salary range if specified")
requirements: list[JobRequirement] = Field(
description="Categorized job requirements",
)
application_deadline: str | None = Field(
description="Application deadline if specified",
)
contact_info: dict | None = Field(
description="Contact information from footer or contact section",
)
@pytest.mark.asyncio
async def test_llm_extraction() -> None:
"""Crawl job postings and extract details."""
api_key: str | None = os.environ.get("OPENAI_API_KEY")
if not api_key:
msg: str = "OPENAI_API_KEY environment variable not set"
raise ValueError(msg)
browser_config: BrowserConfig = BrowserConfig(
verbose=False,
extra_args=[
"--disable-gpu",
"--disable-dev-shm-usage",
"--no-sandbox",
],
)
extraction_strategy: LLMExtractionStrategy = LLMExtractionStrategy(
llm_config=LLMConfig(
provider="openai/gpt-4o",
api_token=api_key,
),
schema=JobPosting.model_json_schema(),
extraction_type="schema",
instruction="""
Extract job posting details, using HTML structure to:
1. Identify requirement priorities from CSS classes (e.g., 'required' vs 'preferred')
2. Extract contact info from the page footer or dedicated contact section
3. Parse salary information from specially formatted elements
4. Determine application deadline from timestamp or date elements
Use HTML attributes and classes to enhance extraction accuracy.
""",
input_format="html",
# chunk_token_threshold=chunk_token_threshold,
)
config: CrawlerRunConfig = CrawlerRunConfig(
cache_mode=CacheMode.BYPASS,
stream=True,
extraction_strategy=extraction_strategy,
)
async with AsyncWebCrawler(config=browser_config) as crawler:
result: CrawlResult
async for result in await crawler.arun_many(
urls=[
"https://www.rocketscience.gg/careers/c77fbdec-fce6-44f1-a05e-8cd76325a1a0/",
],
config=config,
):
assert result.success
assert result.extracted_content
extracted_content: list[dict[str, Any]] = json.loads(result.extracted_content)
assert len(extracted_content) == 1
if __name__ == "__main__":
import subprocess
sys.exit(subprocess.call(["pytest", *sys.argv[1:], sys.argv[0]])) # noqa: S603, S607OS
macOS
Python version
3.12.9
Browser
Chrome
Browser version
No response
Error logs & Screenshots (if applicable)
```
platform darwin -- Python 3.12.9, pytest-8.3.5, pluggy-1.5.0
rootdir: xxx
configfile: pyproject.toml
plugins: anyio-4.9.0, logfire-3.12.0, pytest_httpserver-1.1.3, asyncio-0.26.0, mock-3.14.0
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=function, asyncio_default_test_loop_scope=function
collected 1 item
tests/test_extraction.py [FETCH]... ↓ http://localhost:51368/engineering-manager... | Status: True | Time: 0.74s
[SCRAPE].. ◆ http://localhost:51368/engineering-manager... | Time: 0.096s
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.
Rate limit error: litellm.RateLimitError: RateLimitError: OpenAIException - Request too large for gpt-4o in organization org-XXX on tokens per min (TPM): Limit 30000, Requested 75303. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.
Waiting for 2 seconds before retrying...
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.
Rate limit error: litellm.RateLimitError: RateLimitError: OpenAIException - Request too large for gpt-4o in organization org-XXX on tokens per min (TPM): Limit 30000, Requested 75303. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.
Waiting for 4 seconds before retrying...
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.
Rate limit error: litellm.RateLimitError: RateLimitError: OpenAIException - Request too large for gpt-4o in organization org-XXX on tokens per min (TPM): Limit 30000, Requested 75303. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.
[EXTRACT]. ■ Completed for http://localhost:51368/engineering-manager... | Time: 14.378983207978308s
[COMPLETE] ● http://localhost:51368/engineering-manager... | Status: True | Total: 15.22s
```
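One possible direction for a fix, sketched here with a hypothetical helper name (this is not the project's actual patch): detect the error-list shape before any attribute access, and surface the rate-limit failure explicitly instead of letting `response.usage` raise downstream.

```python
from typing import Any


def handle_completion(response: Any) -> Any:
    """Guard against the error-list shape returned when backoff is exhausted.

    Hypothetical sketch, not the actual crawl4ai code path.
    """
    if isinstance(response, list):
        # Backoff exhausted: raise with the rate-limit details instead of
        # attempting `response.usage` on a list.
        raise RuntimeError(f"LLM completion failed after retries: {response}")
    return response.usage
```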