
Robots.txt Analyzer

A full-stack application that analyzes websites' robots.txt files to check if they are optimized for AI crawlers. The application provides detailed insights, recommendations, and a diff view of suggested changes.

Features

  • Analyze any website's robots.txt file
  • Check AI crawler access permissions
  • Generate optimization recommendations
  • Provide diff view of suggested changes
  • Modern, responsive UI
  • Real-time analysis
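
For context on what the analyzer looks for: an AI-crawler-friendly robots.txt grants access to the user agents used by AI bots. The snippet below is purely illustrative (GPTBot and Google-Extended are examples of real AI crawler user agents; the rules are not taken from this project's code):

User-agent: GPTBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: *
Disallow: /admin/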

Application Screenshots

Home Page

[Screenshot: application home page]

Analysis Results

Optimized Website

[Screenshot: analysis of a URL whose robots.txt is optimized for AI crawlers]

Non-Optimized Website

[Screenshot: analysis of a URL whose robots.txt is not optimized for AI crawlers]

Tech Stack

Backend

  • Python 3.11+
  • FastAPI
  • httpx for async HTTP requests
  • Pydantic for data validation

Frontend

  • React
  • react-diff-viewer-continued for the diff view

Getting Started

Prerequisites

  • Python 3.11 or higher
  • Node.js 18 or higher
  • npm or yarn

Backend Setup

  1. Navigate to the backend directory:

    cd backend
  2. Create and activate a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: `venv\Scripts\activate`
  3. Install dependencies:

    pip install -r requirements.txt
  4. Start the FastAPI server:

    fastapi dev app/main.py

The backend API will be available at http://localhost:8000
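
Once the server is running, FastAPI's auto-generated interactive docs should be available at http://localhost:8000/docs. A quick smoke check from Python (a sketch, assuming the default port above):

import httpx

# Simple smoke check against the locally running backend.
resp = httpx.get("http://localhost:8000/docs")
print(resp.status_code)  # expect 200 when the server is up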

Frontend Setup

  1. Navigate to the frontend directory:

    cd frontend
  2. Install dependencies:

You will see an error after running npm install; run the second command listed below to fix it (the --legacy-peer-deps flag works around the peer-dependency conflict).

    npm install
    npm install react-diff-viewer-continued --legacy-peer-deps
  3. Copy the .env.example file into a .env file

  4. Start the development server:

    npm run dev

The frontend will be available at http://localhost:3000

API Documentation

The API provides the following endpoint:

POST /api/analyze-robots

Analyzes a website's robots.txt file for AI crawler optimization.

Request Body:

{
  "url": "example.com"
}

Response:

{
  "isOptimized": boolean,
  "rawContent": string,
  "aiBotAccess": {
    "allowed": boolean,
    "details": string
  },
  "diff": {
    "originalContent": string,
    "adjustedContent": string,
    "summary": string,
    "recommendations": string[]
  }
}
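
For reference, a minimal Python client call might look like the sketch below (it assumes the backend is running locally on port 8000, as described in Getting Started):

import httpx

# Call the analyze endpoint and print a few fields from the
# response documented above.
response = httpx.post(
    "http://localhost:8000/api/analyze-robots",
    json={"url": "example.com"},
    timeout=30.0,
)
response.raise_for_status()
result = response.json()

print("Optimized:", result["isOptimized"])
print("AI bot access allowed:", result["aiBotAccess"]["allowed"])
for recommendation in result["diff"]["recommendations"]:
    print("-", recommendation)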

Error Handling

The application handles various error cases:

  • Invalid URLs
  • Failed robots.txt fetches
  • Forbidden access (403)
  • Redirect loops
  • Parsing errors
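
How these errors are surfaced to clients is not spelled out here, but a caller can at least distinguish HTTP error responses from network failures (a sketch; the exact error payload shape is an assumption):

import httpx

try:
    resp = httpx.post(
        "http://localhost:8000/api/analyze-robots",
        json={"url": "not-a-valid-url"},
        timeout=30.0,
    )
    resp.raise_for_status()
except httpx.HTTPStatusError as exc:
    # If the backend maps the cases above (invalid URL, failed fetch, 403,
    # redirect loop, parsing error) to HTTP error statuses, they land here.
    print("Analysis failed:", exc.response.status_code, exc.response.text)
except httpx.RequestError as exc:
    # Network-level problems, e.g. the backend is not running.
    print("Request failed:", exc)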

Testing

Backend Tests

Run the backend tests:

cd backend
pytest
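
A minimal sketch of what one backend test could look like, assuming the FastAPI instance is exposed as app in app/main.py (as the fastapi dev command above suggests); this is illustrative, not taken from the actual test suite:

from fastapi.testclient import TestClient

from app.main import app  # assumption: the FastAPI instance is named `app`

client = TestClient(app)

def test_analyze_robots_returns_expected_fields():
    # example.com is a stand-in target; a real test would likely mock the
    # outbound robots.txt fetch instead of hitting the network.
    response = client.post("/api/analyze-robots", json={"url": "example.com"})
    assert response.status_code == 200
    body = response.json()
    assert "isOptimized" in body
    assert "aiBotAccess" in body
    assert "diff" in body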

Frontend Tests

Run the frontend tests:

cd frontend
npm test

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
