An intelligent event discovery system powered by Hugging Face transformers and the MCP protocol.
Two Versions Available:
- Classic (v1.0): spaCy + web scraping (fast, simple)
- Enhanced (v2.0): HF transformers + semantic search + MCP (accurate, powerful)

Classic (v1.0) features:
- Natural Language Understanding: spaCy NER for entity extraction
- Web Search Integration: SerpAPI for finding event pages
- Smart Scraping: JSON-LD + heuristic extraction
- REST API: FastAPI service

Enhanced (v2.0) features:
- Advanced NLP: Hugging Face transformers (BART, BERT, DistilBERT)
- Intent Classification: Zero-shot learning for query understanding
- QA-Based Extraction: Extract events from messy HTML using question-answering
- Semantic Ranking: Sentence transformers for relevance scoring
- Semantic Deduplication: Embedding-based duplicate detection
- MCP Protocol: Expose capabilities as orchestratable tools
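
To make the semantic ranking and deduplication ideas concrete, here is a minimal sketch. It assumes each event already has an embedding vector; the hand-made 3-d vectors, the `rank_and_dedupe` name, and the 0.95 threshold are illustrative assumptions, not taken from the project. A real setup would get the vectors from a sentence-transformer model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_and_dedupe(query_vec, events, dup_threshold=0.95):
    """Sort events by similarity to the query, then drop near-duplicates.

    `events` is a list of (title, embedding) pairs. An event is dropped when
    its similarity to an already-kept event is at or above `dup_threshold`.
    """
    ranked = sorted(events, key=lambda e: cosine(query_vec, e[1]), reverse=True)
    kept = []
    for title, vec in ranked:
        if all(cosine(vec, kept_vec) < dup_threshold for _, kept_vec in kept):
            kept.append((title, vec))
    return [title for title, _ in kept]

# Hand-made vectors standing in for real sentence embeddings (assumption).
query = [1.0, 0.0, 0.0]
events = [
    ("Boston AI Meetup", [0.9, 0.1, 0.0]),
    ("AI Meetup Boston (repost)", [0.9, 0.11, 0.0]),
    ("Pottery Class", [0.0, 0.2, 1.0]),
]
result = rank_and_dedupe(query, events)
```

Ranking before deduplicating means that, of two near-identical listings, the one closer to the query is the one that survives.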
User Query → NLP Parser → Search Query Builder → Web Search API
→ Web Scraper → Event Extractor → Structured Results

Classic (v1.0) quick start:
# 1. Install
pip install -r requirements.txt # Classic deps only
python -m spacy download en_core_web_sm
# 2. Configure
cp env.example .env
# Edit .env: SERP_API_KEY=your_key_here
# 3. Run
python app.py # Classic API
# OR
python example.py "Find AI meetups in Boston"

Enhanced (v2.0) quick start:
# 1. Install (includes HF transformers)
pip install -r requirements.txt # ~2GB of models will download
# 2. Configure (same as classic)
cp env.example .env
# 3. Run
python app_enhanced.py # Enhanced API with HF models
# OR
python event_finder_enhanced.py # Enhanced CLI
# OR
python mcp_server.py # MCP server for tool orchestration

The enhanced version downloads models automatically on first use (~2GB, ~60s).

Command line:
python example.py "Find AI meetups in Boston this weekend"

REST API:
curl -X POST "http://localhost:8000/events" \
  -H "Content-Type: application/json" \
  -d '{"text": "Find AI meetups in Boston this weekend"}'

Python:
from event_finder import EventFinder

finder = EventFinder()
events = finder.find_events("Find hackathons in NYC in December")
for event in events:
    # Each event is a dict with title, start date, and registration link
    print(f"{event['title']} - {event['start_date']}")
    print(f"Register: {event['register_url']}\n")

events-extract/
├── requirements.txt     # Python dependencies
├── README.md            # This file
├── .env                 # Environment variables (create this)
├── app.py               # FastAPI web service
├── event_finder.py      # Main orchestrator
├── nlp_parser.py        # Query parsing logic
├── search_client.py     # Web search integration
├── event_scraper.py     # Web scraping and extraction
└── example.py           # Example usage script

NLP parser (nlp_parser.py):
- Uses spaCy for named entity recognition
- Extracts locations (cities, countries)
- Identifies date/time expressions
- Isolates topic keywords
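
The parsing steps above could look roughly like the sketch below. The `ParsedQuery` class and `parse_query` function are hypothetical names, and treating leftover nouns as topic keywords is an assumption about how topics might be isolated; the actual `nlp_parser.py` may work differently. The `nlp` argument is expected to behave like a loaded spaCy pipeline.

```python
from dataclasses import dataclass, field

@dataclass
class ParsedQuery:
    """Slots pulled out of a free-text event query (hypothetical shape)."""
    locations: list = field(default_factory=list)
    dates: list = field(default_factory=list)
    topics: list = field(default_factory=list)

def parse_query(text, nlp):
    """Split a query into location, date, and topic slots.

    `nlp` is any callable returning a spaCy-style Doc, for example
    the pipeline returned by spacy.load("en_core_web_sm").
    """
    doc = nlp(text)
    parsed = ParsedQuery()
    entity_tokens = set()
    for ent in doc.ents:
        if ent.label_ in ("GPE", "LOC"):      # cities, countries, regions
            parsed.locations.append(ent.text)
        elif ent.label_ in ("DATE", "TIME"):  # "this weekend", "December"
            parsed.dates.append(ent.text)
        else:
            continue
        # Remember which tokens were consumed by a location or date entity.
        entity_tokens.update(tok.i for tok in ent)
    # Assumption: remaining nouns and proper nouns are the topic keywords.
    for tok in doc:
        if tok.i not in entity_tokens and tok.pos_ in ("NOUN", "PROPN"):
            parsed.topics.append(tok.text)
    return parsed
```

With spaCy and the `en_core_web_sm` model installed, `parse_query("Find AI meetups in Boston this weekend", spacy.load("en_core_web_sm"))` should fill the three slots, though the exact entities depend on the model version.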

Search client (search_client.py):
- Builds search queries from parsed slots
- Integrates with SerpAPI (Google, Bing, etc.)
- Returns candidate URLs with snippets
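
As an illustration of the search step, the sketch below builds a query string from parsed slots and a SerpAPI request URL, then reads candidate URLs from a response. All three function names are hypothetical, and the endpoint, the `engine`/`q`/`api_key` parameters, and the `organic_results`/`link`/`snippet` fields reflect my understanding of SerpAPI's JSON API; check them against the SerpAPI documentation before relying on them.

```python
from urllib.parse import urlencode

def build_search_query(topics, locations, dates):
    """Join parsed slots into one search string, skipping empty slots."""
    parts = [" ".join(topics) or "events"]
    if locations:
        parts.append("in " + " ".join(locations))
    parts.extend(dates)
    return " ".join(parts)

def build_serpapi_url(query, api_key, engine="google"):
    """Build a SerpAPI search URL; sending the request is left to the caller."""
    params = {"engine": engine, "q": query, "api_key": api_key}
    return "https://serpapi.com/search?" + urlencode(params)

def candidates_from_response(payload):
    """Pick (url, snippet) pairs out of a SerpAPI-style JSON payload."""
    return [
        (result["link"], result.get("snippet", ""))
        for result in payload.get("organic_results", [])
        if "link" in result
    ]

query = build_search_query(["AI", "meetups"], ["Boston"], ["this weekend"])
url = build_serpapi_url(query, api_key="YOUR_KEY")  # placeholder key
```

Keeping URL construction separate from the HTTP call makes the query logic easy to test without network access or an API key.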

Event scraper (event_scraper.py):
- Fetches and parses HTML content
- Extracts JSON-LD structured data (schema.org/Event)
- Falls back to heuristic-based extraction
- Normalizes event data into standard format
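
The JSON-LD step can be sketched with the standard library alone. This is a simplified illustration, not the project's scraper: `JsonLdCollector` and `extract_events` are hypothetical names, the mapping from schema.org `name`/`startDate`/`url` to `title`/`start_date`/`register_url` is an assumption based on the keys used in the usage example, and it ignores `@graph` wrappers and Event subtypes.

```python
import json
from html.parser import HTMLParser

class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False

def extract_events(html):
    """Return normalized dicts for every schema.org Event found in the page."""
    collector = JsonLdCollector()
    collector.feed(html)
    events = []
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD: skip this block
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "Event":
                events.append({
                    "title": item.get("name"),
                    "start_date": item.get("startDate"),
                    "register_url": item.get("url"),
                })
    return events
```

When `extract_events` returns an empty list, a caller would fall back to heuristic extraction, which is not shown here.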

Event finder (event_finder.py):
- Orchestrates the entire pipeline
- Filters and deduplicates results
- Handles errors gracefully
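
A minimal sketch of the filtering, deduplication, and error handling described above follows. It is an assumption about how such a loop might be written, not the actual `EventFinder` code: `find_events` here is a standalone function, `fetch_and_extract` is a hypothetical callable that turns a URL into a list of event dicts, and deduplicating on title plus start date is one possible choice.

```python
import logging

logger = logging.getLogger("event_finder_sketch")

def find_events(urls, fetch_and_extract):
    """Run extraction over candidate URLs, tolerating per-page failures.

    `fetch_and_extract` is a hypothetical callable: url -> list of event dicts.
    """
    seen = set()
    results = []
    for url in urls:
        try:
            events = fetch_and_extract(url)
        except Exception as exc:  # one bad page should not sink the whole query
            logger.warning("skipping %s: %s", url, exc)
            continue
        for event in events:
            title = (event.get("title") or "").strip()
            if not title:
                continue  # filter: drop events with no usable title
            key = (title.lower(), event.get("start_date"))
            if key in seen:
                continue  # dedupe: same title and start date already kept
            seen.add(key)
            results.append(event)
    return results
```

Catching exceptions per URL means a timeout or parse error on one page is logged and skipped while results from the other pages are still returned.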

Example queries:
- "Find AI meetups in Boston this weekend"
- "Python conferences in San Francisco in December"
- "Hackathons in NYC next month"
- "Machine learning workshops in London"
- "Tech events in Seattle this week"

Planned improvements:
- Better date range normalization
- Event deduplication by title similarity
- Ranking by relevance and date proximity
- Caching for popular queries
- Support for online vs in-person filtering
- Multi-location support
- Custom site-specific scrapers (Meetup, Eventbrite)
- User preference learning
MIT License - Feel free to use and modify for your projects!