Fixed docker config. Added services
59
Readme.md
@@ -2,11 +2,14 @@
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
MyDocManager is a real-time document processing application that automatically detects files in a monitored directory, processes them asynchronously, and stores the results in a database. The application uses a modern microservices architecture with Redis for task queuing and MongoDB for data persistence.
|
MyDocManager is a real-time document processing application that automatically detects files in a monitored directory,
|
||||||
|
processes them asynchronously, and stores the results in a database. The application uses a modern microservices
|
||||||
|
architecture with Redis for task queuing and MongoDB for data persistence.
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
### Technology Stack
|
### Technology Stack
|
||||||
|
|
||||||
- **Backend API**: FastAPI (Python 3.12)
|
- **Backend API**: FastAPI (Python 3.12)
|
||||||
- **Task Processing**: Celery with Redis broker
|
- **Task Processing**: Celery with Redis broker
|
||||||
- **Document Processing**: EasyOCR, PyMuPDF, python-docx, pdfplumber
|
- **Document Processing**: EasyOCR, PyMuPDF, python-docx, pdfplumber
|
||||||
@@ -16,6 +19,7 @@ MyDocManager is a real-time document processing application that automatically d
|
|||||||
- **File Monitoring**: Python watchdog library
|
- **File Monitoring**: Python watchdog library
|
||||||
|
|
||||||
### Services Architecture
|
### Services Architecture
|
||||||
|
|
||||||
┌─────────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
|
┌─────────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
|
||||||
│ Frontend │ │ file- │ │ Redis │ │ Worker │ │ MongoDB │
|
│ Frontend │ │ file- │ │ Redis │ │ Worker │ │ MongoDB │
|
||||||
│ (React) │◄──►│ processor │───►│ (Broker) │◄──►│ (Celery) │───►│ (Results) │
|
│ (React) │◄──►│ processor │───►│ (Broker) │◄──►│ (Celery) │───►│ (Results) │
|
||||||
@@ -24,13 +28,13 @@ MyDocManager is a real-time document processing application that automatically d
|
|||||||
└─────────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
|
└─────────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
|
||||||
|
|
||||||
### Docker Services
|
### Docker Services
|
||||||
|
|
||||||
1. **file-processor**: FastAPI + real-time file monitoring + Celery task dispatch
|
1. **file-processor**: FastAPI + real-time file monitoring + Celery task dispatch
|
||||||
2. **worker**: Celery workers for document processing (OCR, text extraction)
|
2. **worker**: Celery workers for document processing (OCR, text extraction)
|
||||||
3. **redis**: Message broker for Celery tasks
|
3. **redis**: Message broker for Celery tasks
|
||||||
4. **mongodb**: Final database for processing results
|
4. **mongodb**: Final database for processing results
|
||||||
5. **frontend**: React interface for monitoring and file access
|
5. **frontend**: React interface for monitoring and file access
|
||||||
|
|
||||||
|
|
||||||
## Data Flow
|
## Data Flow
|
||||||
|
|
||||||
1. **File Detection**: Watchdog monitors target directory in real-time
|
1. **File Detection**: Watchdog monitors target directory in real-time
|
||||||
@@ -42,11 +46,13 @@ MyDocManager is a real-time document processing application that automatically d
|
|||||||
## Document Processing Capabilities
|
## Document Processing Capabilities
|
||||||
|
|
||||||
### Supported File Types
|
### Supported File Types
|
||||||
|
|
||||||
- **PDF**: Direct text extraction + OCR for scanned documents
|
- **PDF**: Direct text extraction + OCR for scanned documents
|
||||||
- **Word Documents**: .docx text extraction
|
- **Word Documents**: .docx text extraction
|
||||||
- **Images**: OCR text recognition (JPG, PNG, etc.)
|
- **Images**: OCR text recognition (JPG, PNG, etc.)
|
||||||
|
|
||||||
### Processing Libraries
|
### Processing Libraries
|
||||||
|
|
||||||
- **EasyOCR**: Modern OCR engine (80+ languages, deep learning-based)
|
- **EasyOCR**: Modern OCR engine (80+ languages, deep learning-based)
|
||||||
- **PyMuPDF**: PDF text extraction and manipulation
|
- **PyMuPDF**: PDF text extraction and manipulation
|
||||||
- **python-docx**: Word document processing
|
- **python-docx**: Word document processing
|
||||||
@@ -55,12 +61,15 @@ MyDocManager is a real-time document processing application that automatically d
|
|||||||
## Development Environment
|
## Development Environment
|
||||||
|
|
||||||
### Container-Based Development
|
### Container-Based Development
|
||||||
|
|
||||||
The application is designed for container-based development with hot-reload capabilities:
|
The application is designed for container-based development with hot-reload capabilities:
|
||||||
|
|
||||||
- Source code mounted as volumes for real-time updates
|
- Source code mounted as volumes for real-time updates
|
||||||
- All services orchestrated via Docker Compose
|
- All services orchestrated via Docker Compose
|
||||||
- Development and production parity
|
- Development and production parity
|
||||||
|
|
||||||
### Key Features
|
### Key Features
|
||||||
|
|
||||||
- **Real-time Processing**: Immediate file detection and processing
|
- **Real-time Processing**: Immediate file detection and processing
|
||||||
- **Horizontal Scaling**: Multiple workers can be added easily
|
- **Horizontal Scaling**: Multiple workers can be added easily
|
||||||
- **Fault Tolerance**: Celery provides automatic retry mechanisms
|
- **Fault Tolerance**: Celery provides automatic retry mechanisms
|
||||||
@@ -68,6 +77,7 @@ The application is designed for container-based development with hot-reload capa
|
|||||||
- **Hot Reload**: Development changes reflected instantly in containers
|
- **Hot Reload**: Development changes reflected instantly in containers
|
||||||
|
|
||||||
### Docker Services
|
### Docker Services
|
||||||
|
|
||||||
1. **file-processor**: FastAPI + real-time file monitoring + Celery task dispatch
|
1. **file-processor**: FastAPI + real-time file monitoring + Celery task dispatch
|
||||||
2. **worker**: Celery workers for document processing (OCR, text extraction)
|
2. **worker**: Celery workers for document processing (OCR, text extraction)
|
||||||
3. **redis**: Message broker for Celery tasks
|
3. **redis**: Message broker for Celery tasks
|
||||||
@@ -138,6 +148,7 @@ MyDocManager/
|
|||||||
## Authentication & User Management
|
## Authentication & User Management
|
||||||
|
|
||||||
### Security Features
|
### Security Features
|
||||||
|
|
||||||
- **JWT Authentication**: Stateless authentication with 24-hour token expiration
|
- **JWT Authentication**: Stateless authentication with 24-hour token expiration
|
||||||
- **Password Security**: bcrypt hashing with automatic salting
|
- **Password Security**: bcrypt hashing with automatic salting
|
||||||
- **Role-Based Access**: Admin and User roles with granular permissions
|
- **Role-Based Access**: Admin and User roles with granular permissions
|
||||||
@@ -145,16 +156,19 @@ MyDocManager/
|
|||||||
- **Auto Admin Creation**: Default admin user created on first startup
|
- **Auto Admin Creation**: Default admin user created on first startup
|
||||||
|
|
||||||
### User Roles
|
### User Roles
|
||||||
|
|
||||||
- **Admin**: Full access to user management (create, read, update, delete users)
|
- **Admin**: Full access to user management (create, read, update, delete users)
|
||||||
- **User**: Limited access (view own profile, access document processing features)
|
- **User**: Limited access (view own profile, access document processing features)
|
||||||
|
|
||||||
### Authentication Flow
|
### Authentication Flow
|
||||||
|
|
||||||
1. **Login**: User provides credentials → Server validates → Returns JWT token
|
1. **Login**: User provides credentials → Server validates → Returns JWT token
|
||||||
2. **API Access**: Client includes JWT in Authorization header
|
2. **API Access**: Client includes JWT in Authorization header
|
||||||
3. **Token Validation**: Server verifies token signature and expiration
|
3. **Token Validation**: Server verifies token signature and expiration
|
||||||
4. **Role Check**: Server validates user permissions for requested resource
|
4. **Role Check**: Server validates user permissions for requested resource
|
||||||
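The authentication flow listed above (issue a token at login, verify its signature and expiration, then check the role) can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: the function names `create_token`, `verify_token`, and `has_role`, the claim names, and the choice of HS256 are assumptions, and a real service would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

TOKEN_TTL_SECONDS = 24 * 60 * 60  # 24-hour expiration, as stated in the README


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(username: str, role: str, secret: str, now: Optional[float] = None) -> str:
    """Step 1 (login): build a signed token carrying the user name and role."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": username, "role": role, "iat": issued_at, "exp": issued_at + TOKEN_TTL_SECONDS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8")) for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> Optional[dict]:
    """Step 3 (validation): return the payload if signature and expiration are valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except (ValueError, UnicodeError):
        return None
    current = time.time() if now is None else now
    if payload.get("exp", 0) <= current:
        return None
    return payload


def has_role(payload: dict, required_role: str) -> bool:
    """Step 4 (role check): compare the role claim against the required role."""
    return payload.get("role") == required_role
```

In step 2 of the flow the client would send the returned string in the `Authorization` header; the server then calls `verify_token` followed by `has_role` before serving an admin-only route.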
|
|
||||||
### User Management APIs
|
### User Management APIs
|
||||||
|
|
||||||
```
|
```
|
||||||
POST /auth/login # Generate JWT token
|
POST /auth/login # Generate JWT token
|
||||||
GET /users # List all users (admin only)
|
GET /users # List all users (admin only)
|
||||||
@@ -164,7 +178,6 @@ DELETE /users/{user_id} # Delete user (admin only)
|
|||||||
GET /users/me # Get current user profile (authenticated users)
|
GET /users/me # Get current user profile (authenticated users)
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
||||||
## Docker Commands Reference
|
## Docker Commands Reference
|
||||||
|
|
||||||
### Initial Setup & Build
|
### Initial Setup & Build
|
||||||
@@ -248,9 +261,9 @@ docker-compose up --scale worker=3
|
|||||||
### Hot-Reload Configuration
|
### Hot-Reload Configuration
|
||||||
|
|
||||||
- **file-processor**: Hot-reload enabled via `--reload` flag
|
- **file-processor**: Hot-reload enabled via `--reload` flag
|
||||||
- Code changes in `src/file-processor/app/` automatically restart FastAPI
|
- Code changes in `src/file-processor/app/` automatically restart FastAPI
|
||||||
- **worker**: No hot-reload (manual restart required for stability)
|
- **worker**: No hot-reload (manual restart required for stability)
|
||||||
- Code changes in `src/worker/tasks/` require: `docker-compose restart worker`
|
- Code changes in `src/worker/tasks/` require: `docker-compose restart worker`
|
||||||
|
|
||||||
### Useful Service URLs
|
### Useful Service URLs
|
||||||
|
|
||||||
@@ -274,41 +287,48 @@ curl -X POST http://localhost:8000/test-task \
|
|||||||
# Monitor Celery tasks
|
# Monitor Celery tasks
|
||||||
docker-compose logs -f worker
|
docker-compose logs -f worker
|
||||||
```
|
```
|
||||||
|
|
||||||
## Default Admin User
|
## Default Admin User
|
||||||
|
|
||||||
On first startup, the application automatically creates a default admin user:
|
On first startup, the application automatically creates a default admin user:
|
||||||
|
|
||||||
- **Username**: `admin`
|
- **Username**: `admin`
|
||||||
- **Password**: `admin`
|
- **Password**: `admin`
|
||||||
- **Role**: `admin`
|
- **Role**: `admin`
|
||||||
- **Email**: `admin@mydocmanager.local`
|
- **Email**: `admin@mydocmanager.local`
|
||||||
**⚠️ Important**: Change the default admin password immediately after first login in production environments.
|
**⚠️ Important**: Change the default admin password immediately after first login in production environments.
|
||||||
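The first-startup behaviour described above is an idempotent check: create the default admin only when no active admin exists. The sketch below mirrors that logic against an in-memory store so it runs without MongoDB. The `User` and `InMemoryUserStore` classes are hypothetical stand-ins for the project's models and `UserService`, and the password field is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class User:
    username: str
    email: str
    role: str
    is_active: bool = True


@dataclass
class InMemoryUserStore:
    """Hypothetical stand-in for the real user service, backed by a list."""
    users: List[User] = field(default_factory=list)

    def list_users(self) -> List[User]:
        return list(self.users)

    def create_user(self, user: User) -> User:
        self.users.append(user)
        return user


def ensure_admin_user_exists(store: InMemoryUserStore) -> Optional[User]:
    """Create the default admin if no active admin exists; return None otherwise."""
    if any(u.role == "admin" and u.is_active for u in store.list_users()):
        return None
    return store.create_user(User("admin", "admin@mydocmanager.local", "admin"))
```

Calling the function twice on the same store creates the admin once and returns `None` the second time, which is what makes it safe to run on every startup.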
|
|
||||||
## Key Implementation Notes
|
## Key Implementation Notes
|
||||||
|
|
||||||
### Python Standards
|
### Python Standards
|
||||||
|
|
||||||
- **Style**: PEP 8 compliance
|
- **Style**: PEP 8 compliance
|
||||||
- **Documentation**: Google/NumPy docstring format
|
- **Documentation**: Google/NumPy docstring format
|
||||||
- **Naming**: snake_case for variables and functions
|
- **Naming**: snake_case for variables and functions
|
||||||
- **Testing**: pytest with test_i_can_xxx / test_i_cannot_xxx patterns
|
- **Testing**: pytest with test_i_can_xxx / test_i_cannot_xxx patterns
|
||||||
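The `test_i_can_xxx` / `test_i_cannot_xxx` naming convention above might look like the following. The commit only shows that `validate_username_not_empty` returns `username.strip()`; the rejection of blank input and the error message here are assumptions made for illustration. With pytest available, the failure case would usually be written with `pytest.raises(ValueError)`; a plain `try`/`except` is used so the sketch has no third-party dependency.

```python
def validate_username_not_empty(username: str) -> str:
    """Return the stripped username; reject empty or whitespace-only input (assumed behaviour)."""
    if not username or not username.strip():
        raise ValueError("Username cannot be empty")
    return username.strip()


def test_i_can_validate_a_username_with_surrounding_spaces():
    # Success path: the name states what the caller is able to do.
    assert validate_username_not_empty("  alice ") == "alice"


def test_i_cannot_validate_a_blank_username():
    # Failure path: the name states what must be refused.
    try:
        validate_username_not_empty("   ")
    except ValueError:
        return
    raise AssertionError("expected ValueError for a blank username")
```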
|
|
||||||
### Security Best Practices
|
### Security Best Practices
|
||||||
|
|
||||||
- **Password Storage**: Never store plain text passwords, always use bcrypt hashing
|
- **Password Storage**: Never store plain text passwords, always use bcrypt hashing
|
||||||
- **JWT Secrets**: Use strong, randomly generated secret keys in production
|
- **JWT Secrets**: Use strong, randomly generated secret keys in production
|
||||||
- **Token Expiration**: 24-hour expiration with secure signature validation
|
- **Token Expiration**: 24-hour expiration with secure signature validation
|
||||||
- **Role Validation**: Server-side role checking for all protected endpoints
|
- **Role Validation**: Server-side role checking for all protected endpoints
|
||||||
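The password-storage rule above (never store plaintext, salt every hash) can be illustrated without extra dependencies. The README specifies bcrypt, which is a third-party package; this sketch substitutes PBKDF2-HMAC-SHA256 from the standard library purely to show per-password salting and constant-time verification. It is not the project's `app.utils.security` module, which this commit does not show, and the storage format and iteration count are assumptions.

```python
import hashlib
import hmac
import os

DEFAULT_ITERATIONS = 600_000  # assumed work factor; tune to the deployment hardware


def hash_password(password: str, iterations: int = DEFAULT_ITERATIONS) -> str:
    """Hash a password with a fresh random salt and encode everything needed to verify it."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    try:
        scheme, iterations_text, salt_hex, digest_hex = stored.split("$")
        if scheme != "pbkdf2_sha256":
            return False
        candidate = hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations_text)
        )
        expected = bytes.fromhex(digest_hex)
    except (ValueError, OverflowError):
        # Malformed stored value: treat as a failed verification.
        return False
    return hmac.compare_digest(candidate, expected)
```

Because the salt is random, hashing the same password twice yields different stored strings, yet both verify against the original password.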
|
|
||||||
### Dependencies Management
|
### Dependencies Management
|
||||||
|
|
||||||
- **Package Manager**: pip (standard)
|
- **Package Manager**: pip (standard)
|
||||||
- **External Dependencies**: Listed in each service's requirements.txt
|
- **External Dependencies**: Listed in each service's requirements.txt
|
||||||
- **Standard Library First**: Prefer standard library when possible
|
- **Standard Library First**: Prefer standard library when possible
|
||||||
|
|
||||||
### Testing Strategy
|
### Testing Strategy
|
||||||
|
|
||||||
- All code must be testable
|
- All code must be testable
|
||||||
- Unit tests for each authentication and user management function
|
- Unit tests for each authentication and user management function
|
||||||
- Integration tests for complete authentication flow
|
- Integration tests for complete authentication flow
|
||||||
- Tests validated before implementation
|
- Tests validated before implementation
|
||||||
|
|
||||||
### Critical Architecture Decisions Made
|
### Critical Architecture Decisions Made
|
||||||
|
|
||||||
1. **JWT Authentication**: Simple token-based auth with 24-hour expiration
|
1. **JWT Authentication**: Simple token-based auth with 24-hour expiration
|
||||||
2. **Role-Based Access**: Admin/User roles for granular permissions
|
2. **Role-Based Access**: Admin/User roles for granular permissions
|
||||||
3. **bcrypt Password Hashing**: Industry-standard password security
|
3. **bcrypt Password Hashing**: Industry-standard password security
|
||||||
@@ -320,31 +340,24 @@ On first startup, the application automatically creates a default admin user:
|
|||||||
9. **Container Development**: Hot-reload setup required for development workflow
|
9. **Container Development**: Hot-reload setup required for development workflow
|
||||||
|
|
||||||
### Development Process Requirements
|
### Development Process Requirements
|
||||||
|
|
||||||
1. **Collaborative Validation**: All options must be explained before coding
|
1. **Collaborative Validation**: All options must be explained before coding
|
||||||
2. **Test-First Approach**: Test cases defined and validated before implementation
|
2. **Test-First Approach**: Test cases defined and validated before implementation
|
||||||
3. **Incremental Development**: Start simple, extend functionality progressively
|
3. **Incremental Development**: Start simple, extend functionality progressively
|
||||||
4. **Error Handling**: Clear problem explanation required before proposing fixes
|
4. **Error Handling**: Clear problem explanation required before proposing fixes
|
||||||
|
|
||||||
### Next Implementation Steps
|
### Next Implementation Steps
|
||||||
1. ✅ Create docker-compose.yml with all services
|
|
||||||
2. ✅ Define user management and authentication architecture
|
1. ✅ Create docker-compose.yml with all services => Done
|
||||||
3. Implement user models and authentication services
|
2. ✅ Define user management and authentication architecture => Done
|
||||||
4. Create protected API routes for user management
|
3. ✅ Implement user models and authentication services =>
|
||||||
5. Add automatic admin user creation
|
1. models/user.py => Done
|
||||||
|
2. models/auth.py => Done
|
||||||
|
3. database/repositories/user_repository.py => Done
|
||||||
|
4. Add automatic admin user creation if it does not exist
|
||||||
|
5. Create protected API routes for user management
|
||||||
6. Implement basic FastAPI service structure
|
6. Implement basic FastAPI service structure
|
||||||
7. Add watchdog file monitoring
|
7. Add watchdog file monitoring
|
||||||
8. Create Celery task structure
|
8. Create Celery task structure
|
||||||
9. Implement document processing tasks
|
9. Implement document processing tasks
|
||||||
10. Build React monitoring interface with authentication
|
10. Build React monitoring interface with authentication
|
||||||
|
|
||||||
### Next steps
|
|
||||||
MongoDB CRUD
|
|
||||||
We absolutely must mock MongoDB for the unit tests with pytest-mock
|
|
||||||
Files to create:
|
|
||||||
* app/models/auht.py => already done
|
|
||||||
* app/models/user.py => already done
|
|
||||||
* app/database/connection.py
|
|
||||||
* Uses the settings for the MongoDB URL. A configuration file must be created (app/config/settings.py)
|
|
||||||
* get_database() function + error handling
|
|
||||||
* Configuration via environment variables
|
|
||||||
* app/database/repositories/user_repository.py
|
|
||||||
@@ -1,5 +1,3 @@
|
|||||||
version: '3.8'
|
|
||||||
|
|
||||||
services:
|
services:
|
||||||
# Redis - Message broker for Celery
|
# Redis - Message broker for Celery
|
||||||
redis:
|
redis:
|
||||||
@@ -36,15 +34,16 @@ services:
|
|||||||
environment:
|
environment:
|
||||||
- REDIS_URL=redis://redis:6379/0
|
- REDIS_URL=redis://redis:6379/0
|
||||||
- MONGODB_URL=mongodb://admin:password123@mongodb:27017/mydocmanager?authSource=admin
|
- MONGODB_URL=mongodb://admin:password123@mongodb:27017/mydocmanager?authSource=admin
|
||||||
|
- PYTHONPATH=/app
|
||||||
volumes:
|
volumes:
|
||||||
- ./src/file-processor/app:/app
|
- ./src/file-processor:/app
|
||||||
- ./volumes/watched_files:/watched_files
|
- ./volumes/watched_files:/watched_files
|
||||||
depends_on:
|
depends_on:
|
||||||
- redis
|
- redis
|
||||||
- mongodb
|
- mongodb
|
||||||
networks:
|
networks:
|
||||||
- mydocmanager-network
|
- mydocmanager-network
|
||||||
command: uvicorn main:app --host 0.0.0.0 --port 8000 --reload
|
command: uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
|
||||||
|
|
||||||
# Worker - Celery workers for document processing
|
# Worker - Celery workers for document processing
|
||||||
worker:
|
worker:
|
||||||
@@ -55,6 +54,7 @@ services:
|
|||||||
environment:
|
environment:
|
||||||
- REDIS_URL=redis://redis:6379/0
|
- REDIS_URL=redis://redis:6379/0
|
||||||
- MONGODB_URL=mongodb://admin:password123@mongodb:27017/mydocmanager?authSource=admin
|
- MONGODB_URL=mongodb://admin:password123@mongodb:27017/mydocmanager?authSource=admin
|
||||||
|
- PYTHONPATH=/app
|
||||||
volumes:
|
volumes:
|
||||||
- ./src/worker/tasks:/app
|
- ./src/worker/tasks:/app
|
||||||
- ./volumes/watched_files:/watched_files
|
- ./volumes/watched_files:/watched_files
|
||||||
|
|||||||
@@ -8,10 +8,12 @@ COPY requirements.txt .
|
|||||||
RUN pip install --no-cache-dir -r requirements.txt
|
RUN pip install --no-cache-dir -r requirements.txt
|
||||||
|
|
||||||
# Copy application code
|
# Copy application code
|
||||||
COPY app/ .
|
COPY . .
|
||||||
|
|
||||||
|
ENV PYTHONPATH=/app
|
||||||
|
|
||||||
# Expose port
|
# Expose port
|
||||||
EXPOSE 8000
|
EXPOSE 8000
|
||||||
|
|
||||||
# Command will be overridden by docker-compose
|
# Command will be overridden by docker-compose
|
||||||
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
|
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
|
||||||
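The Docker changes above move the import root: the mount changes from `./src/file-processor/app:/app` to `./src/file-processor:/app`, `PYTHONPATH=/app` is set, and modules are now addressed as `app.main` and `app.config.settings` instead of `main` and `config.settings`. The sketch below reproduces that effect in a temporary directory. The package names `demoapp` and `democonfig` are hypothetical stand-ins for `app` and `config`, chosen to avoid clashing with installed modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_tree(root: Path) -> None:
    """Create root/demoapp/democonfig/settings.py, mimicking src/file-processor/app/config/settings.py."""
    config_dir = root / "demoapp" / "democonfig"
    config_dir.mkdir(parents=True)
    (root / "demoapp" / "__init__.py").write_text("")
    (config_dir / "__init__.py").write_text("")
    (config_dir / "settings.py").write_text("DATABASE_NAME = 'mydocmanager'\n")


def can_import(module_name: str, search_root: Path) -> bool:
    """Return True if module_name resolves when search_root is placed on sys.path."""
    sys.path.insert(0, str(search_root))
    try:
        importlib.invalidate_caches()
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False
    finally:
        sys.path.remove(str(search_root))
        # Forget the demo packages so each check starts from a clean state.
        for name in list(sys.modules):
            if name.split(".")[0] in ("demoapp", "democonfig"):
                del sys.modules[name]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_demo_tree(root)
        # New layout: the service directory is the import root, so package-qualified names work.
        print(can_import("demoapp.democonfig.settings", root))
        # The old-style short name no longer resolves from that root.
        print(can_import("democonfig.settings", root))
        # Old layout: the package directory itself was the import root.
        print(can_import("democonfig.settings", root / "demoapp"))
```

This is why the commit has to change the mount, the `uvicorn` target, and the `from app....` imports together: the names only resolve when they agree with the directory that sits on the import path.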
@@ -11,7 +11,7 @@ from pymongo import MongoClient
|
|||||||
from pymongo.database import Database
|
from pymongo.database import Database
|
||||||
from pymongo.errors import ConnectionFailure, ServerSelectionTimeoutError
|
from pymongo.errors import ConnectionFailure, ServerSelectionTimeoutError
|
||||||
|
|
||||||
from config.settings import get_mongodb_url, get_mongodb_database_name
|
from app.config.settings import get_mongodb_url, get_mongodb_database_name
|
||||||
|
|
||||||
# Global variables for singleton pattern
|
# Global variables for singleton pattern
|
||||||
_client: Optional[MongoClient] = None
|
_client: Optional[MongoClient] = None
|
||||||
|
|||||||
@@ -13,7 +13,7 @@ from pymongo.errors import DuplicateKeyError
|
|||||||
from pymongo.collection import Collection
|
from pymongo.collection import Collection
|
||||||
|
|
||||||
from app.models.user import UserCreate, UserInDB, UserUpdate
|
from app.models.user import UserCreate, UserInDB, UserUpdate
|
||||||
from utils.security import hash_password
|
from app.utils.security import hash_password
|
||||||
|
|
||||||
|
|
||||||
class UserRepository:
|
class UserRepository:
|
||||||
|
|||||||
@@ -4,19 +4,74 @@ FastAPI application for MyDocManager file processor service.
|
|||||||
This service provides API endpoints for health checks and task dispatching.
|
This service provides API endpoints for health checks and task dispatching.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
import os
|
import os
|
||||||
from fastapi import FastAPI, HTTPException
|
from contextlib import asynccontextmanager
|
||||||
|
from fastapi import FastAPI, HTTPException, Depends
|
||||||
from pydantic import BaseModel
|
from pydantic import BaseModel
|
||||||
import redis
|
import redis
|
||||||
from celery import Celery
|
from celery import Celery
|
||||||
|
|
||||||
from database.connection import test_database_connection
|
from app.database.connection import test_database_connection, get_database
|
||||||
|
from app.database.repositories.user_repository import UserRepository
|
||||||
|
from app.models.user import UserCreate
|
||||||
|
from app.services.init_service import InitializationService
|
||||||
|
from app.services.user_service import UserService
|
||||||
|
|
||||||
|
# Configure logging
|
||||||
|
logging.basicConfig(level=logging.INFO)
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
@asynccontextmanager
async def lifespan(app: FastAPI):
    """
    Application lifespan manager for startup and shutdown tasks.

    Handles initialization tasks that need to run when the application starts,
    including admin user creation and other setup procedures.
    """
    # Startup tasks
    logger.info("Starting MyDocManager application...")

    try:
        # Initialize database connection
        database = get_database()

        # Initialize repositories and services
        user_repository = UserRepository(database)
        user_service = UserService(user_repository)
        init_service = InitializationService(user_service)

        # Run initialization tasks
        initialization_result = init_service.initialize_application()

        if initialization_result["initialization_success"]:
            logger.info("Application startup completed successfully")
            if initialization_result["admin_user_created"]:
                logger.info("Default admin user was created during startup")
        else:
            logger.error("Application startup completed with errors:")
            for error in initialization_result["errors"]:
                logger.error(f"  - {error}")

    except Exception as e:
        logger.error(f"Critical error during application startup: {str(e)}")
        # You might want to decide if the app should continue or exit here
        # For now, we log the error but continue

    yield  # Application is running

    # Shutdown tasks (if needed)
    logger.info("Shutting down MyDocManager application...")
|
|
||||||
|
|
||||||
# Initialize FastAPI app
|
# Initialize FastAPI app
|
||||||
app = FastAPI(
|
app = FastAPI(
|
||||||
title="MyDocManager File Processor",
|
title="MyDocManager File Processor",
|
||||||
description="File processing and task dispatch service",
|
description="File processing and task dispatch service",
|
||||||
version="1.0.0"
|
version="1.0.0",
|
||||||
|
lifespan=lifespan
|
||||||
)
|
)
|
||||||
|
|
||||||
# Environment variables
|
# Environment variables
|
||||||
@@ -44,6 +99,27 @@ class TestTaskRequest(BaseModel):
|
|||||||
message: str
|
message: str
|
||||||
|
|
||||||
|
|
||||||
def get_user_service() -> UserService:
    """
    Dependency to get user service instance.

    This should be properly implemented with database connection management
    in your actual application.
    """
    database = get_database()
    user_repository = UserRepository(database)
    return UserService(user_repository)


# Your API routes would use the service like this:
@app.post("/api/users")
async def create_user(
    user_data: UserCreate,
    user_service: UserService = Depends(get_user_service)
):
    return user_service.create_user(user_data)
|
|
||||||
|
|
||||||
@app.get("/health")
|
@app.get("/health")
|
||||||
async def health_check():
|
async def health_check():
|
||||||
"""
|
"""
|
||||||
|
|||||||
@@ -100,14 +100,19 @@ def validate_username_not_empty(username: str) -> str:
|
|||||||
return username.strip()
|
return username.strip()
|
||||||
|
|
||||||
|
|
||||||
class UserCreate(BaseModel):
|
class UserCreateNoValidation(BaseModel):
|
||||||
"""Model for creating a new user."""
|
"""Model for creating a new user."""
|
||||||
|
|
||||||
username: str
|
username: str
|
||||||
email: EmailStr
|
email: str
|
||||||
password: str
|
password: str
|
||||||
role: UserRole = UserRole.USER
|
role: UserRole = UserRole.USER
|
||||||
|
|
||||||
|
|
||||||
|
class UserCreate(UserCreateNoValidation):
|
||||||
|
"""Model for creating a new user."""
|
||||||
|
email: EmailStr
|
||||||
|
|
||||||
@field_validator('username')
|
@field_validator('username')
|
||||||
@classmethod
|
@classmethod
|
||||||
def validate_username(cls, v):
|
def validate_username(cls, v):
|
||||||
|
|||||||
0
src/file-processor/app/services/__init__.py
Normal file
58
src/file-processor/app/services/auth_service.py
Normal file
@@ -0,0 +1,58 @@
|
|||||||
"""
Authentication service for password hashing and verification.

This module provides authentication-related functionality including
password hashing, verification, and JWT token management.
"""

from app.utils.security import hash_password, verify_password


class AuthService:
    """
    Service class for authentication operations.

    Handles password hashing, verification, and other authentication
    related operations with proper security practices.
    """

    @staticmethod
    def hash_user_password(password: str) -> str:
        """
        Hash a plaintext password for secure storage.

        Args:
            password (str): Plaintext password to hash

        Returns:
            str: Hashed password safe for database storage

        Example:
            >>> auth = AuthService()
            >>> hashed = auth.hash_user_password("mypassword123")
            >>> len(hashed) > 0
            True
        """
        return hash_password(password)

    @staticmethod
    def verify_user_password(password: str, hashed_password: str) -> bool:
        """
        Verify a password against its hash.

        Args:
            password (str): Plaintext password to verify
            hashed_password (str): Stored hashed password

        Returns:
            bool: True if password matches hash, False otherwise

        Example:
            >>> auth = AuthService()
            >>> hashed = auth.hash_user_password("mypassword123")
            >>> auth.verify_user_password("mypassword123", hashed)
            True
            >>> auth.verify_user_password("wrongpassword", hashed)
            False
        """
        return verify_password(password, hashed_password)
134
src/file-processor/app/services/init_service.py
Normal file
@@ -0,0 +1,134 @@
|
|||||||
|
"""
|
||||||
|
Initialization service for application startup tasks.
|
||||||
|
|
||||||
|
This module handles application initialization tasks including
|
||||||
|
creating default admin user if none exists.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from typing import Optional
|
||||||
|
|
||||||
|
from app.models.user import UserCreate, UserInDB, UserCreateNoValidation
|
||||||
|
from app.models.auth import UserRole
|
||||||
|
from app.services.user_service import UserService
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
class InitializationService:
|
||||||
|
"""
|
||||||
|
Service for handling application initialization tasks.
|
||||||
|
|
||||||
|
This service manages startup operations like ensuring required
|
||||||
|
users exist and system is properly configured.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def __init__(self, user_service: UserService):
|
||||||
|
"""
|
||||||
|
Initialize service with user service dependency.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
user_service (UserService): Service for user operations
|
||||||
|
"""
|
||||||
|
self.user_service = user_service
|
||||||
|
|
||||||
|
|
||||||
|
def ensure_admin_user_exists(self) -> Optional[UserInDB]:
|
||||||
|
"""
|
||||||
|
Ensure default admin user exists in the system.
|
||||||
|
|
||||||
|
Creates a default admin user if no admin user exists in the system.
|
||||||
|
Uses default credentials that should be changed after first login.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
UserInDB or None: Created admin user if created, None if already exists
|
||||||
|
|
||||||
|
Raises:
|
||||||
|
Exception: If admin user creation fails
|
||||||
|
"""
|
||||||
|
logger.info("Checking if admin user exists...")
|
||||||
|
|
||||||
|
# Check if any admin user already exists
|
||||||
|
if self._admin_user_exists():
|
||||||
|
logger.info("Admin user already exists, skipping creation")
|
||||||
|
return None
|
||||||
|
|
||||||
|
logger.info("No admin user found, creating default admin user...")
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Create default admin user
|
||||||
|
admin_data = UserCreateNoValidation(
|
||||||
|
username="admin",
|
||||||
|
email="admin@mydocmanager.local",
|
||||||
|
password="admin", # Should be changed after first login
|
||||||
|
role=UserRole.ADMIN
|
||||||
|
)
|
||||||
|
|
||||||
|
created_user = self.user_service.create_user(admin_data)
|
||||||
|
logger.info(f"Default admin user created successfully with ID: {created_user.id}")
|
||||||
|
logger.warning(
|
||||||
|
"Default admin user created with username 'admin' and password 'admin'. "
|
||||||
|
"Please change these credentials immediately for security!"
|
||||||
|
)
|
||||||
|
|
||||||
|
return created_user
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Failed to create default admin user: {str(e)}")
|
||||||
|
raise Exception(f"Admin user creation failed: {str(e)}")
|
||||||
|
|
||||||
|
    def _admin_user_exists(self) -> bool:
        """
        Check if any active admin user exists in the system.

        Returns:
            bool: True if at least one active admin user exists (or if the
                check itself fails), False otherwise
        """
        try:
            # Get all users and check if any active user has the admin role
            users = self.user_service.list_users(limit=1000)  # Reasonable limit for admin check

            for user in users:
                if user.role == UserRole.ADMIN and user.is_active:
                    return True

            return False

        except Exception as e:
            logger.error(f"Error checking for admin users: {str(e)}")
            # In case of error, assume admin exists to avoid creating duplicates
            return True

    def initialize_application(self) -> dict:
        """
        Perform all application initialization tasks.

        This method runs all necessary initialization procedures including
        admin user creation and any other startup requirements.

        Returns:
            dict: Summary of initialization tasks performed
        """
        logger.info("Starting application initialization...")

        initialization_summary = {
            "admin_user_created": False,
            "initialization_success": False,
            "errors": []
        }

        try:
            # Ensure admin user exists
            created_admin = self.ensure_admin_user_exists()
            if created_admin:
                initialization_summary["admin_user_created"] = True

            initialization_summary["initialization_success"] = True
            logger.info("Application initialization completed successfully")

        except Exception as e:
            error_msg = f"Application initialization failed: {str(e)}"
            logger.error(error_msg)
            initialization_summary["errors"].append(error_msg)

        return initialization_summary
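The admin bootstrap above is meant to be idempotent: it creates a default admin only when no active admin is found, and returns `None` otherwise. A minimal, self-contained sketch of that behavior follows. `InMemoryUserService`, `User`, `ensure_admin`, and the `MYDOC_ADMIN_PASSWORD` environment variable are hypothetical names introduced for illustration and are not part of this commit. Reading the initial password from the environment is an assumption about how the hard-coded `"admin"` literal might be avoided, not something the commit does.

```python
import os
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class User:
    username: str
    role: str
    is_active: bool = True


@dataclass
class InMemoryUserService:
    """Hypothetical stand-in for UserService, used only for this sketch."""
    users: list = field(default_factory=list)

    def list_users(self, limit: int = 100) -> list:
        return self.users[:limit]

    def create_user(self, username: str, password: str, role: str) -> User:
        # The password is ignored here; a real service would hash and store it.
        user = User(username=username, role=role)
        self.users.append(user)
        return user


def ensure_admin(service: InMemoryUserService) -> Optional[User]:
    """Create an admin only when no active admin exists; return None otherwise."""
    existing = service.list_users(limit=1000)
    if any(u.role == "admin" and u.is_active for u in existing):
        return None
    # Assumption: the initial password comes from the environment, with the
    # commit's "admin" default kept as a fallback.
    password = os.environ.get("MYDOC_ADMIN_PASSWORD", "admin")
    return service.create_user("admin", password, role="admin")


service = InMemoryUserService()
first = ensure_admin(service)   # creates the admin
second = ensure_admin(service)  # finds the admin and does nothing
```

Calling the function twice mirrors two application restarts: only the first call creates a user.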
181
src/file-processor/app/services/user_service.py
Normal file
@@ -0,0 +1,181 @@
"""
User service for business logic operations.

This module provides user-related business logic including user creation,
retrieval, updates, and authentication operations with proper error handling.
"""

from typing import Optional, List
from pymongo.errors import DuplicateKeyError

from app.models.user import UserCreate, UserInDB, UserUpdate, UserResponse, UserCreateNoValidation
from app.models.auth import UserRole
from app.database.repositories.user_repository import UserRepository
from app.services.auth_service import AuthService


class UserService:
    """
    Service class for user business logic operations.

    This class handles user-related operations including creation,
    authentication, and data management with proper validation.
    """

    def __init__(self, user_repository: UserRepository):
        """
        Initialize user service with repository dependency.

        Args:
            user_repository (UserRepository): Repository for user data operations
        """
        self.user_repository = user_repository
        self.auth_service = AuthService()

    def create_user(self, user_data: UserCreate | UserCreateNoValidation) -> UserInDB:
        """
        Create a new user with business logic validation.

        Args:
            user_data (UserCreate | UserCreateNoValidation): User creation data

        Returns:
            UserInDB: Created user with database information

        Raises:
            ValueError: If user already exists or validation fails
        """
        # Check if user already exists
        if self.user_repository.user_exists(user_data.username):
            raise ValueError(f"User with username '{user_data.username}' already exists")

        # Check if email already exists
        existing_user = self.user_repository.find_user_by_email(user_data.email)
        if existing_user:
            raise ValueError(f"User with email '{user_data.email}' already exists")

        try:
            return self.user_repository.create_user(user_data)
        except DuplicateKeyError as e:
            # A concurrent insert can still hit a duplicate key after the checks above
            raise ValueError(f"User with username '{user_data.username}' already exists") from e

    def get_user_by_username(self, username: str) -> Optional[UserInDB]:
        """
        Retrieve user by username.

        Args:
            username (str): Username to search for

        Returns:
            UserInDB or None: User if found, None otherwise
        """
        return self.user_repository.find_user_by_username(username)

    def get_user_by_id(self, user_id: str) -> Optional[UserInDB]:
        """
        Retrieve user by ID.

        Args:
            user_id (str): User ID to search for

        Returns:
            UserInDB or None: User if found, None otherwise
        """
        return self.user_repository.find_user_by_id(user_id)

    def authenticate_user(self, username: str, password: str) -> Optional[UserInDB]:
        """
        Authenticate user with username and password.

        Args:
            username (str): Username for authentication
            password (str): Password for authentication

        Returns:
            UserInDB or None: Authenticated user if valid, None otherwise
        """
        user = self.user_repository.find_user_by_username(username)
        if not user:
            return None

        if not user.is_active:
            return None

        if not self.auth_service.verify_user_password(password, user.hashed_password):
            return None

        return user

    def update_user(self, user_id: str, user_update: UserUpdate) -> Optional[UserInDB]:
        """
        Update user information.

        Args:
            user_id (str): User ID to update
            user_update (UserUpdate): Updated user data

        Returns:
            UserInDB or None: Updated user if successful, None otherwise

        Raises:
            ValueError: If username or email already exists for different user
        """
        # Validate username uniqueness if being updated
        if user_update.username is not None:
            existing_user = self.user_repository.find_user_by_username(user_update.username)
            if existing_user and str(existing_user.id) != user_id:
                raise ValueError(f"Username '{user_update.username}' is already taken")

        # Validate email uniqueness if being updated
        if user_update.email is not None:
            existing_user = self.user_repository.find_user_by_email(user_update.email)
            if existing_user and str(existing_user.id) != user_id:
                raise ValueError(f"Email '{user_update.email}' is already taken")

        return self.user_repository.update_user(user_id, user_update)

    def delete_user(self, user_id: str) -> bool:
        """
        Delete user from system.

        Args:
            user_id (str): User ID to delete

        Returns:
            bool: True if user was deleted, False otherwise
        """
        return self.user_repository.delete_user(user_id)

    def list_users(self, skip: int = 0, limit: int = 100) -> List[UserInDB]:
        """
        List users with pagination.

        Args:
            skip (int): Number of users to skip (default: 0)
            limit (int): Maximum number of users to return (default: 100)

        Returns:
            List[UserInDB]: List of users
        """
        return self.user_repository.list_users(skip=skip, limit=limit)

    def count_users(self) -> int:
        """
        Count total number of users.

        Returns:
            int: Total number of users in system
        """
        return self.user_repository.count_users()

    def user_exists(self, username: str) -> bool:
        """
        Check if user exists by username.

        Args:
            username (str): Username to check

        Returns:
            bool: True if user exists, False otherwise
        """
        return self.user_repository.user_exists(username)
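`authenticate_user` in the service above rejects a login in three steps: unknown username, inactive account, then password mismatch. A rough, standard-library-only sketch of that ordering follows. `StoredUser`, `hash_password`, and `authenticate` are hypothetical names for illustration. PBKDF2 is swapped in here purely so the example runs without third-party packages; the actual `AuthService.verify_user_password` is not shown in this diff and may work differently.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Optional


def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2 is a stand-in chosen for this sketch; it is not the project's hasher.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


@dataclass
class StoredUser:
    username: str
    salt: bytes
    hashed_password: bytes
    is_active: bool = True


def authenticate(users: dict, username: str, password: str) -> Optional[StoredUser]:
    """Return the user on success, None on any failure (same shape as the service)."""
    user = users.get(username)
    if user is None:
        return None
    if not user.is_active:
        return None
    candidate = hash_password(password, user.salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(candidate, user.hashed_password):
        return None
    return user


salt = os.urandom(16)
users = {
    "alice": StoredUser("alice", salt, hash_password("s3cret", salt)),
    "bob": StoredUser("bob", salt, hash_password("hunter2", salt), is_active=False),
}

ok = authenticate(users, "alice", "s3cret")
wrong_password = authenticate(users, "alice", "wrong")
inactive = authenticate(users, "bob", "hunter2")
unknown = authenticate(users, "carol", "anything")
```

Returning `None` for every failure mode, as the service does, keeps callers from learning whether a username exists or an account is disabled.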
@@ -1,6 +1,9 @@
-fastapi==0.116.1
-uvicorn==0.35.0
+bcrypt==4.3.0
 celery==5.5.3
-redis==6.4.0
+email-validator==2.3.0
+fastapi==0.116.1
+httptools==0.6.4
 pymongo==4.15.0
 pydantic==2.11.9
+redis==6.4.0
+uvicorn==0.35.0