Network Scanning and Visualization Tool - Architecture Design

Executive Summary

This document outlines the architecture for a network scanning and visualization tool that discovers hosts on a local network, collects network information, and presents it through an interactive web interface with Visio-style diagrams.

1. Technology Stack

Backend

  • Language: Python 3.10+

    • Rich ecosystem for network tools
    • Excellent library support
    • Cross-platform compatibility
    • Easy integration with system tools
  • Web Framework: FastAPI

    • Modern, fast async support
    • Built-in WebSocket support for real-time updates
    • Automatic API documentation
    • Type hints for better code quality
  • Network Scanning:

    • python-nmap - Python wrapper for nmap
    • scapy - Packet manipulation (fallback, requires privileges)
    • socket library - Basic connectivity checks (no root needed)
    • netifaces - Network interface enumeration
  • Service Detection:

    • python-nmap with service/version detection
    • Custom banner grabbing for common ports
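    • shodan (optional) for fingerprinting internet-facing services only; Shodan indexes public addresses, so it has no data on private RFC 1918 hosts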

Frontend

  • Framework: React 18+ with TypeScript

    • Component-based architecture
    • Strong typing for reliability
    • Large ecosystem
    • Excellent performance
  • Visualization:

    • Primary: react-flow (npm package reactflow; newer releases are published by the xyflow project as @xyflow/react)
      • Modern, maintained library
      • Built for interactive diagrams
      • Great performance with many nodes
      • Drag-and-drop, zoom, pan built-in
    • Alternative: D3.js with d3-force for force-directed graphs
    • Export: html2canvas + jsPDF for PDF export
  • UI Framework:

    • Material-UI (MUI) or shadcn/ui
    • Responsive design
    • Professional appearance
  • State Management:

    • Zustand or Redux Toolkit
    • WebSocket integration for real-time updates

Data Storage

  • Primary: SQLite

    • No separate server needed
    • Perfect for single-user/small team
    • Easy backup (single file)
    • Fast for this use case
  • ORM: SQLAlchemy

    • Powerful query builder
    • Migration support with Alembic
    • Type-safe with Pydantic models
  • Cache: Redis (optional)

    • Cache scan results
    • Rate limiting
    • Session management

Deployment

  • Development:

    • Docker Compose for easy setup
    • Hot reload for both frontend and backend
  • Production:

    • Single Docker container or native install
    • Nginx as reverse proxy
    • systemd service file

2. High-Level Architecture

┌─────────────────────────────────────────────────────────────┐
│                        Web Browser                           │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │   Dashboard  │  │   Network    │  │   Settings   │      │
│  │              │  │   Diagram    │  │              │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
└───────────────────────┬─────────────────────────────────────┘
                        │ HTTP/WebSocket
                        ▼
┌─────────────────────────────────────────────────────────────┐
│                      FastAPI Backend                         │
│  ┌──────────────────────────────────────────────────────┐   │
│  │              REST API Endpoints                       │   │
│  │  /scan, /hosts, /topology, /export                   │   │
│  └──────────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────────┐   │
│  │              WebSocket Handler                        │   │
│  │  (Real-time scan progress and updates)               │   │
│  └──────────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────────┐   │
│  │              Business Logic Layer                     │   │
│  │  ┌────────────┐  ┌────────────┐  ┌────────────┐     │   │
│  │  │  Scanner   │  │  Topology  │  │  Exporter  │     │   │
│  │  │  Manager   │  │  Analyzer  │  │            │     │   │
│  │  └────────────┘  └────────────┘  └────────────┘     │   │
│  └──────────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────────┐   │
│  │              Scanning Engine                          │   │
│  │  ┌────────────┐  ┌────────────┐  ┌────────────┐     │   │
│  │  │   Nmap     │  │   Socket   │  │  Service   │     │   │
│  │  │  Scanner   │  │  Scanner   │  │  Detector  │     │   │
│  │  └────────────┘  └────────────┘  └────────────┘     │   │
│  └──────────────────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────────────────┐   │
│  │              Data Access Layer                        │   │
│  │  (SQLAlchemy ORM + Pydantic Models)                  │   │
│  └──────────────────────────────────────────────────────┘   │
└───────────────────────┬─────────────────────────────────────┘
                        ▼
┌─────────────────────────────────────────────────────────────┐
│                    SQLite Database                           │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │  Hosts   │  │  Ports   │  │  Scans   │  │ Topology │   │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘   │
└─────────────────────────────────────────────────────────────┘

Component Responsibilities

Frontend Components

  1. Dashboard: Overview, scan statistics, recently discovered hosts
  2. Network Diagram: Interactive visualization with zoom/pan/drag
  3. Host Details: Detailed view of individual hosts
  4. Scan Manager: Configure and trigger scans
  5. Settings: Network ranges, scan profiles, preferences

Backend Components

  1. Scanner Manager: Orchestrates scanning operations, manages scan queue
  2. Topology Analyzer: Detects relationships and connections between hosts
  3. Exporter: Generates PDF, PNG, JSON exports
  4. WebSocket Handler: Pushes real-time updates to clients

3. Network Scanning Approach

Scanning Strategy (No Root Required)

Phase 1: Host Discovery

# Primary method: TCP connect scan to common ports (no root; a true SYN scan needs raw sockets)
Target ports: 22, 80, 443, 445, 3389, 8080
Method: Socket connect() with timeout
Parallelization: ThreadPoolExecutor with ~50 workers

Advantages:

  • No root required
  • Reliable on most networks
  • Fast with parallelization

Implementation:

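import socket
from concurrent.futures import ThreadPoolExecutor  # used to probe many hosts in parallel

# Ports probed during discovery (the target list above)
DISCOVERY_PORTS = (22, 80, 443, 445, 3389, 8080)

def check_host(ip: str, ports: tuple[int, ...] = DISCOVERY_PORTS, timeout: float = 1.0) -> bool:
    """Return True if any of the given TCP ports accepts a connection."""
    for port in ports:
        try:
            # The context manager closes the socket even if connect_ex raises
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
                sock.settimeout(timeout)
                if sock.connect_ex((ip, port)) == 0:
                    return True
        except OSError:
            # Unreachable network, invalid address, etc.: try the next port
            continue
    # Note: a refused connection also proves the host is up; this check counts open ports only
    return False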
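
The parallelization step (ThreadPoolExecutor with about 50 workers) is not shown above. A minimal sketch follows, assuming a probe callable such as `check_host`; the function name `discover_hosts` and its parameters are illustrative, not part of the original design.

```python
import ipaddress
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def discover_hosts(
    network: str,
    probe: Callable[[str], bool],
    max_workers: int = 50,
) -> list[str]:
    """Probe every usable address in `network` in parallel and return the live ones."""
    targets = [str(ip) for ip in ipaddress.ip_network(network).hosts()]
    if not targets:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with targets
        results = list(pool.map(probe, targets))
    return [ip for ip, alive in zip(targets, results) if alive]
```

A call such as `discover_hosts("192.168.1.0/24", check_host)` would sweep a /24. Passing the probe in as a parameter keeps the sweep testable without touching the network.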

Phase 2: Port Scanning (with nmap fallback)

Option A: Without Root (Preferred)

# Use python-nmap with -sT (TCP connect scan)
# Or implement custom TCP connect scanner
nmap_args = "-sT -p 1-1000 --open -T4"

Option B: With Root (Better accuracy)

# Use nmap with SYN scan
nmap_args = "-sS -p 1-65535 --open -T4"

Scanning Profiles:

  1. Quick Scan: Top 100 ports, 254 hosts in ~30 seconds
  2. Standard Scan: Top 1000 ports, ~2-3 minutes
  3. Deep Scan: All 65535 ports, ~15-20 minutes
  4. Custom: User-defined port ranges
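
One way to turn these profiles into nmap arguments is sketched below. Note that `-p 1-1000` (Option A) scans ports 1 through 1000, whereas nmap's `--top-ports` option selects the most commonly open ports, which is what "Top 100" and "Top 1000" suggest. The mapping and the function name `build_nmap_args` are assumptions for illustration.

```python
import re
from typing import Optional

# Hypothetical mapping from the profiles above to nmap port arguments
PROFILE_PORT_ARGS = {
    "quick": ["--top-ports", "100"],
    "standard": ["--top-ports", "1000"],
    "deep": ["-p", "1-65535"],
}

# Accepts lists such as "22", "20-25" or "20-25,80,443"
PORT_SPEC = re.compile(r"\d{1,5}(-\d{1,5})?(,\d{1,5}(-\d{1,5})?)*")


def build_nmap_args(
    profile: str,
    custom_ports: Optional[str] = None,
    privileged: bool = False,
) -> list[str]:
    """Return nmap arguments for a scan profile as a list (never a shell string)."""
    scan_type = "-sS" if privileged else "-sT"  # SYN scan needs root
    if profile == "custom":
        if not custom_ports or not PORT_SPEC.fullmatch(custom_ports):
            raise ValueError("custom profile needs a port list such as '20-25,80'")
        port_args = ["-p", custom_ports]
    elif profile in PROFILE_PORT_ARGS:
        port_args = PROFILE_PORT_ARGS[profile]
    else:
        raise ValueError(f"unknown scan profile: {profile!r}")
    return [scan_type, *port_args, "--open", "-T4"]
```

Returning a list rather than a single string keeps the arguments compatible with `subprocess.run` without a shell.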

Phase 3: Service Detection

# Service version detection
nmap_args += " -sV"

# OS detection (requires root, optional)
# nmap_args += " -O"

# Custom banner grabbing for common services
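def grab_banner(ip: str, port: int, timeout: float = 3.0) -> str:
    """Return the first bytes a service sends after connect, or '' if none arrive."""
    try:
        # create_connection applies the timeout to the connect and to later reads
        with socket.create_connection((ip, port), timeout=timeout) as sock:
            return sock.recv(1024).decode('utf-8', errors='ignore')
    except OSError:
        # Refused, reset, or timed out (e.g. HTTP servers wait for the client to speak first)
        return ""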

Phase 4: DNS Resolution

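import socket
from typing import Optional

def resolve_hostname(ip: str) -> Optional[str]:
    """Reverse-resolve an IP address; return None when the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None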

Connection Detection

Unprivileged Methods (no root needed):

  1. Traceroute Analysis: Detect gateway/routing paths
  2. TTL Analysis: Group hosts by TTL to infer network segments
  3. Response Time: Measure latency patterns
  4. Port Patterns: Hosts with similar open ports likely same segment

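Additional Methods (root required where noted):

  1. ARP Cache: Parse the local ARP table for MAC addresses (reading the cache usually needs no root, but it lists only same-subnet hosts this machine has recently contacted)
  2. Packet Sniffing: Capture traffic with scapy (requires root)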
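
The TTL Analysis item above can be sketched as follows. The sketch assumes the common initial TTL values of 64, 128, and 255 and guesses the hop count from the nearest one at or above the observed value; how the observed TTLs are collected (for example by parsing ping output) is left open, and both function names are illustrative.

```python
# Common initial TTLs: 64 (Linux/macOS), 128 (Windows), 255 (many network devices)
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimate_hops(observed_ttl: int) -> tuple[int, int]:
    """Return (assumed initial TTL, estimated hop count) for an observed TTL."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError(f"TTL out of range: {observed_ttl}")
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial, initial - observed_ttl


def group_by_hops(observed: dict[str, int]) -> dict[int, list[str]]:
    """Group host IPs by estimated hop distance, a rough proxy for network segment."""
    groups: dict[int, list[str]] = {}
    for ip, ttl in sorted(observed.items()):
        _, hops = estimate_hops(ttl)
        groups.setdefault(hops, []).append(ip)
    return groups
```

Hosts that land in the same hop group are candidates for the same segment; the result is a heuristic and should carry a low confidence value.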

Recommended Approach:

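# Detect default gateway
import netifaces
from typing import Optional

def get_default_gateway() -> Optional[str]:
    """Return the IPv4 default gateway address, or None if there is no default route."""
    default = netifaces.gateways().get('default', {})
    entry = default.get(netifaces.AF_INET)  # (address, interface) tuple when present
    return entry[0] if entry else None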

# Infer topology based on scanning data
def infer_topology(hosts):
    gateway = get_default_gateway()
    
    topology = {
        'gateway': gateway,
        'segments': [],
        'connections': []
    }
    
    # Group hosts by response characteristics
    # Connect hosts to gateway
    # Detect server-client relationships (open ports)
    
    return topology
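
The three commented steps in `infer_topology` could be filled in along these lines. This is a sketch under stated assumptions: hosts are plain dicts with `ip` and `open_ports` keys, segments are approximated as /24 blocks, and the port sets used for classification are illustrative guesses rather than rules from the original design.

```python
import ipaddress
from typing import Optional

SERVER_PORTS = {22, 25, 53, 80, 443, 445}  # assumption: typical server-side services
WORKSTATION_PORTS = {3389, 5900}           # assumption: remote-desktop services


def classify_host(ip: str, open_ports: list[int], gateway: Optional[str]) -> str:
    """Guess a device type from the open ports."""
    if ip == gateway:
        return "gateway"
    ports = set(open_ports)
    if ports & SERVER_PORTS:
        return "server"
    if ports & WORKSTATION_PORTS:
        return "workstation"
    return "unknown"


def infer_star_topology(hosts: list[dict], gateway: Optional[str]) -> dict:
    """Group hosts into /24 segments and link each one to the gateway."""
    segments: dict[str, list[str]] = {}
    connections = []
    for host in hosts:
        ip = host["ip"]
        subnet = str(ipaddress.ip_network(f"{ip}/24", strict=False))
        segments.setdefault(subnet, []).append(ip)
        if gateway and ip != gateway:
            # Inferred, not observed, so the confidence is deliberately low
            connections.append(
                {"source": gateway, "target": ip, "type": "gateway", "confidence": 0.5}
            )
    return {
        "gateway": gateway,
        "segments": [{"subnet": s, "hosts": ips} for s, ips in sorted(segments.items())],
        "connections": connections,
        "types": {
            h["ip"]: classify_host(h["ip"], h.get("open_ports", []), gateway) for h in hosts
        },
    }
```

The star shape (every host linked to the gateway) is the simplest defensible guess when only scan data is available; later evidence such as traceroute paths can refine it.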

Safety Considerations

  1. Rate Limiting: Max 50 concurrent connections, 1-2 second delays
  2. Timeout Control: 1-3 second socket timeouts
  3. Scan Scope: Only scan RFC1918 private ranges by default
  4. User Consent: Clear warnings about network scanning
  5. Logging: Comprehensive audit trail
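
The scan-scope rule can be enforced with the standard `ipaddress` module. The sketch below checks the three RFC 1918 blocks explicitly, because `is_private` also returns True for loopback and link-local ranges; the function name `is_rfc1918` is illustrative.

```python
import ipaddress

RFC1918_BLOCKS = tuple(
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
)


def is_rfc1918(network: str) -> bool:
    """Return True only if `network` lies entirely inside an RFC 1918 block."""
    try:
        net = ipaddress.ip_network(network, strict=False)
    except ValueError:
        return False
    if net.version != 4:
        return False  # RFC 1918 covers IPv4 only
    return any(net.subnet_of(block) for block in RFC1918_BLOCKS)
```

With `strict=False`, an input such as `192.168.1.5/24` is accepted and normalized to `192.168.1.0/24` instead of being rejected for having host bits set.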

4. Visualization Strategy

Graph Layout

Primary Algorithm: Force-Directed Layout

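  • Library: d3-force, with the computed positions passed to react-flow (react-flow handles rendering and interaction but leaves layout to external libraries)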
  • Advantages: Natural, organic appearance; automatic spacing
  • Best for: Networks with < 100 nodes

Alternative Algorithms:

  1. Hierarchical (Layered): Gateway at top, subnets in layers
  2. Circular: Hosts arranged in circles by subnet
  3. Grid: Organized grid layout for large networks
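
As an illustration of the circular alternative, the following sketch computes positions for one ring of hosts on the backend, in the `{x, y}` shape used for node positions. The function name, default radius, and rounding are assumptions; a per-subnet layout would call it once per subnet with a different center.

```python
import math


def circular_layout(
    node_ids: list[str],
    radius: float = 200.0,
    center: tuple[float, float] = (0.0, 0.0),
) -> dict[str, dict[str, float]]:
    """Place nodes evenly on a circle and return {node_id: {"x": ..., "y": ...}}."""
    count = len(node_ids)
    positions = {}
    for index, node_id in enumerate(node_ids):
        angle = 2 * math.pi * index / count
        positions[node_id] = {
            "x": round(center[0] + radius * math.cos(angle), 2),
            "y": round(center[1] + radius * math.sin(angle), 2),
        }
    return positions
```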

Visual Design

Node Representation

interface NetworkNode {
  id: string;
  type: 'gateway' | 'server' | 'workstation' | 'device' | 'unknown';
  position: { x: number; y: number };
  data: {
    ip: string;
    hostname: string;
    openPorts: number[];
    services: Service[]; // Service: per-port service record, not defined in this document
    status: 'online' | 'offline' | 'scanning';
  };
}

Visual Properties:

  • Shape:
    • Gateway: Diamond
    • Server: Cylinder/Rectangle
    • Workstation: Monitor icon
    • Device: Circle
  • Color:
    • By status (green=online, red=offline, yellow=scanning)
    • Or by type
  • Size: Proportional to number of open ports
  • Labels: IP + hostname (if available)
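
These rules can be expressed as a small lookup that the backend or frontend applies per node. The sketch below is in Python for consistency with the backend examples; the fallback shape and color for unknown values, and the size constants, are assumptions.

```python
SHAPES = {
    "gateway": "diamond",
    "server": "rectangle",
    "workstation": "monitor",
    "device": "circle",
}
STATUS_COLORS = {"online": "green", "offline": "red", "scanning": "yellow"}


def node_style(
    device_type: str,
    status: str,
    open_port_count: int,
    base_size: int = 40,
    step: int = 4,
    max_size: int = 100,
) -> dict:
    """Map host attributes to the visual properties listed above."""
    return {
        "shape": SHAPES.get(device_type, "circle"),   # assumption: unknown types fall back to a circle
        "color": STATUS_COLORS.get(status, "gray"),   # assumption: gray for unrecognized status
        # Size grows linearly with open ports, capped so busy hosts do not dominate the canvas
        "size": min(max_size, base_size + step * max(0, open_port_count)),
    }
```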

Edge Representation

interface NetworkEdge {
  id: string;
  source: string;
  target: string;
  type: 'network' | 'service';
  data: {
    latency: number;
    bandwidth?: number; // if detected
  };
}

Visual Properties:

  • Width: Connection strength/frequency
  • Color: Connection type
  • Style: Solid for confirmed, dashed for inferred
  • Animation: Pulse effect for active scanning

Interactive Features

  1. Node Interactions:

    • Click: Show host details panel
    • Hover: Tooltip with quick info
    • Drag: Reposition (sticky after drop)
    • Double-click: Focus/isolate node
  2. Canvas Interactions:

    • Pan: Click and drag background
    • Zoom: Mouse wheel or pinch
    • Minimap: Overview navigator
    • Selection: Lasso or box select
  3. Controls:

    • Layout algorithm selector
    • Filter by: type, status, ports
    • Search/highlight hosts
    • Export button
    • Refresh/rescan

React-Flow Implementation Example

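import React, { useCallback, useEffect, useState } from 'react';
import ReactFlow, {
  Node,
  Edge,
  Controls,
  MiniMap,
  Background
} from 'reactflow';
import 'reactflow/dist/style.css';

const NetworkDiagram: React.FC = () => {
  const [nodes, setNodes] = useState<Node[]>([]);
  const [edges, setEdges] = useState<Edge[]>([]);

  useEffect(() => {
    // Fetch topology from API
    fetch('/api/topology')
      .then(r => {
        if (!r.ok) throw new Error(`Topology request failed: ${r.status}`);
        return r.json();
      })
      .then(data => {
        setNodes(data.nodes);
        setEdges(data.edges);
      })
      .catch(err => console.error(err));
  }, []);

  // Placeholder handler: a real implementation would open the host details panel
  const handleNodeClick = useCallback((_event: React.MouseEvent, node: Node) => {
    console.log('Selected host', node.id);
  }, []);

  return (
    <ReactFlow
      nodes={nodes}
      edges={edges}
      onNodeClick={handleNodeClick}
      fitView
    >
      <Controls />
      <MiniMap />
      <Background />
    </ReactFlow>
  );
};

export default NetworkDiagram;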

5. Data Model

Database Schema

-- Scans table: Track scanning operations
CREATE TABLE scans (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    started_at TIMESTAMP NOT NULL,
    completed_at TIMESTAMP,
    scan_type VARCHAR(50), -- 'quick', 'standard', 'deep', 'custom'
    network_range VARCHAR(100), -- '192.168.1.0/24'
    status VARCHAR(20), -- 'running', 'completed', 'failed'
    hosts_found INTEGER DEFAULT 0,
    ports_scanned INTEGER DEFAULT 0,
    error_message TEXT
);

-- Hosts table: Discovered network hosts
CREATE TABLE hosts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    ip_address VARCHAR(45) NOT NULL UNIQUE, -- Support IPv4 and IPv6
    hostname VARCHAR(255),
    mac_address VARCHAR(17),
    first_seen TIMESTAMP NOT NULL,
    last_seen TIMESTAMP NOT NULL,
    status VARCHAR(20), -- 'online', 'offline'
    os_guess VARCHAR(255),
    device_type VARCHAR(50), -- 'gateway', 'server', 'workstation', etc.
    vendor VARCHAR(255), -- Based on MAC OUI lookup
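    notes TEXT
);

-- SQLite rejects inline INDEX clauses, so indexes are created separately.
-- ip_address needs no extra index: its UNIQUE constraint already creates one.
CREATE INDEX idx_hosts_status ON hosts(status);
CREATE INDEX idx_hosts_last_seen ON hosts(last_seen);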

-- Ports table: Open ports for each host
CREATE TABLE ports (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    host_id INTEGER NOT NULL,
    port_number INTEGER NOT NULL,
    protocol VARCHAR(10) DEFAULT 'tcp', -- 'tcp', 'udp'
    state VARCHAR(20), -- 'open', 'closed', 'filtered'
    service_name VARCHAR(100),
    service_version VARCHAR(255),
    banner TEXT,
    first_seen TIMESTAMP NOT NULL,
    last_seen TIMESTAMP NOT NULL,
    
    FOREIGN KEY (host_id) REFERENCES hosts(id) ON DELETE CASCADE,
    -- The index behind this UNIQUE constraint also serves (host_id, port_number) lookups
    UNIQUE(host_id, port_number, protocol)
);

-- Connections table: Detected relationships between hosts
CREATE TABLE connections (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_host_id INTEGER NOT NULL,
    target_host_id INTEGER NOT NULL,
    connection_type VARCHAR(50), -- 'gateway', 'same_subnet', 'service'
    confidence FLOAT, -- 0.0 to 1.0
    detected_at TIMESTAMP NOT NULL,
    last_verified TIMESTAMP,
    metadata JSON, -- Additional connection details
    
    FOREIGN KEY (source_host_id) REFERENCES hosts(id) ON DELETE CASCADE,
    FOREIGN KEY (target_host_id) REFERENCES hosts(id) ON DELETE CASCADE
);

CREATE INDEX idx_connections_source ON connections(source_host_id);
CREATE INDEX idx_connections_target ON connections(target_host_id);

-- Scan results: Many-to-many relationship
CREATE TABLE scan_hosts (
    scan_id INTEGER NOT NULL,
    host_id INTEGER NOT NULL,
    
    FOREIGN KEY (scan_id) REFERENCES scans(id) ON DELETE CASCADE,
    FOREIGN KEY (host_id) REFERENCES hosts(id) ON DELETE CASCADE,
    PRIMARY KEY (scan_id, host_id)
);

-- Settings table: Application configuration
CREATE TABLE settings (
    key VARCHAR(100) PRIMARY KEY,
    value TEXT NOT NULL,
    updated_at TIMESTAMP NOT NULL
);
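
The `ON DELETE CASCADE` clauses in this schema take effect only when foreign-key enforcement is switched on, which SQLite leaves off by default for each connection. A minimal sketch with the standard `sqlite3` module follows; it uses a reduced version of the hosts and ports tables, and the helper name `open_database` is illustrative.

```python
import sqlite3


def open_database(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database with foreign keys enforced and create a reduced schema."""
    conn = sqlite3.connect(path)
    # Must be set per connection, outside any transaction
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS hosts (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            ip_address VARCHAR(45) NOT NULL UNIQUE
        );
        CREATE TABLE IF NOT EXISTS ports (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            host_id INTEGER NOT NULL,
            port_number INTEGER NOT NULL,
            protocol VARCHAR(10) DEFAULT 'tcp',
            FOREIGN KEY (host_id) REFERENCES hosts(id) ON DELETE CASCADE,
            UNIQUE(host_id, port_number, protocol)
        );
        """
    )
    return conn
```

With SQLAlchemy the same pragma would typically be issued from a connection event listener; that wiring is not shown here.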

Pydantic Models (API)

from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel, ConfigDict

class PortInfo(BaseModel):
    port_number: int
    protocol: str = "tcp"
    state: str
    # Pydantic v2 treats Optional[...] without a default as required, so default to None
    service_name: Optional[str] = None
    service_version: Optional[str] = None
    banner: Optional[str] = None

class HostBase(BaseModel):
    ip_address: str
    hostname: Optional[str] = None
    mac_address: Optional[str] = None

class HostCreate(HostBase):
    pass

class Host(HostBase):
    # Pydantic v2 style: allow building the model from SQLAlchemy ORM objects
    model_config = ConfigDict(from_attributes=True)

    id: int
    first_seen: datetime
    last_seen: datetime
    status: str
    device_type: Optional[str] = None
    os_guess: Optional[str] = None
    vendor: Optional[str] = None
    ports: List[PortInfo] = []

class Connection(BaseModel):
    id: int
    source_host_id: int
    target_host_id: int
    connection_type: str
    confidence: float
    
class TopologyNode(BaseModel):
    id: str
    type: str
    position: dict
    data: dict

class TopologyEdge(BaseModel):
    id: str
    source: str
    target: str
    type: str
    
class Topology(BaseModel):
    nodes: List[TopologyNode]
    edges: List[TopologyEdge]

class ScanConfig(BaseModel):
    network_range: str
    scan_type: str = "quick"
    port_range: Optional[str] = None
    include_service_detection: bool = True
    
class ScanStatus(BaseModel):
    scan_id: int
    status: str
    progress: float  # 0.0 to 1.0
    hosts_found: int
    current_host: Optional[str] = None
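
To show how stored rows relate to the `Topology` payload, the sketch below converts host and connection records into the node and edge shapes defined above. It uses plain dicts so it runs without Pydantic; the function name, the placeholder positions, and the mapping of `connection_type` values onto the two edge types are assumptions.

```python
def build_topology(hosts: list[dict], connections: list[dict]) -> dict:
    """Convert host and connection records into {"nodes": [...], "edges": [...]}."""
    nodes = [
        {
            "id": str(host["id"]),
            "type": host.get("device_type") or "unknown",
            "position": {"x": 0, "y": 0},  # placeholder; a layout step assigns real positions
            "data": {
                "ip": host["ip_address"],
                "hostname": host.get("hostname"),
                "status": host.get("status", "offline"),
            },
        }
        for host in hosts
    ]
    known_ids = {node["id"] for node in nodes}
    edges = []
    for conn in connections:
        source = str(conn["source_host_id"])
        target = str(conn["target_host_id"])
        if source not in known_ids or target not in known_ids:
            continue  # skip edges that point at hosts missing from this payload
        edges.append(
            {
                "id": f"e{conn['id']}",
                "source": source,
                "target": target,
                # assumption: 'service' stays as is, everything else renders as a network link
                "type": "service" if conn["connection_type"] == "service" else "network",
            }
        )
    return {"nodes": nodes, "edges": edges}
```

The returned dict has the same field names as `TopologyNode` and `TopologyEdge`, so it could be validated with `Topology(**result)` before being returned from the API.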

6. Security and Ethical Considerations

  1. Authorized Access Only:

    • Display prominent warning on first launch
    • Require explicit confirmation to scan
    • Default to scanning only local subnet
    • Log all scanning activities
  2. Privacy:

    • Don't store sensitive data (passwords, traffic content)
    • Encrypt database if storing on shared systems
    • Clear privacy policy
  3. Network Impact:

    • Rate limiting to prevent network disruption
    • Respect opt-out mechanisms such as an operator-maintained do-not-scan list (robots.txt governs web crawlers, not port scans)
    • Provide "stealth mode" with slower scans

Application Security

  1. Authentication (if multi-user):

    # JWT-based authentication
    # Or simple API key for single-user
    
  2. Input Validation:

    import ipaddress
    
    def validate_network_range(network: str) -> bool:
        try:
            net = ipaddress.ip_network(network)
            # Only allow private ranges
            return net.is_private
        except ValueError:
            return False
    
  3. Command Injection Prevention:

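    # Never use shell=True
    # Pass arguments as a list and validate every value that reaches nmap
    import ipaddress
    import subprocess
    
    def safe_nmap_scan(target: str) -> str:
        # Validate target: ip_address() rejects anything that is not a literal IP,
        # so option-like strings such as "-oN file" never reach nmap
        try:
            ipaddress.ip_address(target)
        except ValueError:
            raise ValueError("Invalid target") from None
    
        # Use subprocess safely: argument list, no shell, bounded runtime
        cmd = ["nmap", "-sT", target]
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=300, check=True)
        return result.stdout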
    
  4. API Security:

    • CORS configuration for production
    • Rate limiting on scan endpoints
    • Request validation with Pydantic
    • HTTPS in production
  5. File System Security:

    • Restrict database file permissions (600)
    • Validate export file paths
    • Limit export file sizes
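
Export path validation can be sketched with `pathlib`: resolve the requested name against the export directory and reject anything that escapes it. The function name `resolve_export_path` and the idea of a single dedicated export directory are assumptions.

```python
from pathlib import Path


def resolve_export_path(export_dir: Path, filename: str) -> Path:
    """Return the absolute export path, or raise if it falls outside export_dir."""
    base = export_dir.resolve()
    candidate = (base / filename).resolve()
    # Rejects "../" traversal, absolute paths, and an empty filename
    if candidate == base or base not in candidate.parents:
        raise ValueError(f"export path escapes {base}: {filename!r}")
    return candidate
```

Resolving before comparing matters: a textual check for `..` misses symlinks and absolute paths, both of which `resolve()` normalizes.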

Deployment Security

  1. Docker Security:

    # Run as non-root user
    USER appuser
    
    # Drop unnecessary capabilities
    # Avoid --privileged; if root-level (SYN) scans are needed, grant only the raw-socket capability (e.g. --cap-add=NET_RAW)
    
  2. Network Isolation:

    • Run in Docker network
    • Expose only necessary ports
    • Use reverse proxy (nginx)
  3. Updates:

    • Keep dependencies updated
    • Regular security audits
    • Dependabot/Renovate integration

7. Implementation Roadmap

Phase 1: Core Scanning (Week 1-2)

  • Basic host discovery (socket-based)
  • SQLite database setup
  • Simple CLI interface
  • Store scan results

Phase 2: Enhanced Scanning (Week 2-3)

  • Integrate python-nmap
  • Service detection
  • Port scanning profiles
  • DNS resolution

Phase 3: Backend API (Week 3-4)

  • FastAPI setup
  • REST endpoints for scans, hosts
  • WebSocket for real-time updates
  • Basic topology inference

Phase 4: Frontend Basics (Week 4-5)

  • React setup with TypeScript
  • Dashboard with host list
  • Scan configuration UI
  • Host detail view

Phase 5: Visualization (Week 5-6)

  • React-flow integration
  • Force-directed layout
  • Interactive node/edge rendering
  • Real-time updates via WebSocket

Phase 6: Polish (Week 6-7)

  • Export functionality (PDF, PNG, JSON)
  • Advanced filters and search
  • Settings and preferences
  • Error handling and validation

Phase 7: Deployment (Week 7-8)

  • Docker containerization
  • Documentation
  • Security hardening
  • Testing and bug fixes

8. Technology Justification

Why Python?

  • Proven: Industry standard for network tools
  • Libraries: Excellent support for network operations
  • Maintainability: Readable, well-documented
  • Community: Large community for troubleshooting

Why FastAPI?

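  • Performance: High throughput for a Python framework (built on Starlette and Uvicorn); the FastAPI project itself describes it as on par with Node.js and Go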
  • Modern: Async/await support out of the box
  • Type Safety: Leverages Python type hints
  • Documentation: Auto-generated OpenAPI docs

Why React + TypeScript?

  • Maturity: Battle-tested in production
  • TypeScript: Catches errors at compile time
  • Ecosystem: Vast library ecosystem
  • Performance: Virtual DOM, efficient updates

Why react-flow?

  • Purpose-Built: Designed for interactive diagrams
  • Performance: Handles 1000+ nodes smoothly
  • Features: Built-in zoom, pan, minimap, selection
  • Customization: Easy to style and extend

Why SQLite?

  • Simplicity: No separate database server
  • Performance: Fast for this use case
  • Portability: Single file, easy backup
  • Reliability: Well-tested, stable

9. Alternative Architectures Considered

Alternative 1: Electron Desktop App

  • Pros: Native OS integration, no web server
  • Cons: Larger bundle size, more complex deployment
  • Verdict: Web-based is more flexible

Alternative 2: Go Backend

  • Pros: Better performance, single binary
  • Cons: Fewer network libraries, steeper learning curve
  • Verdict: Python's ecosystem wins for this use case

Alternative 3: Vue.js Frontend

  • Pros: Simpler learning curve, good performance
  • Cons: Smaller ecosystem, fewer diagram libraries
  • Verdict: React's ecosystem is more mature

Alternative 4: Cytoscape.js Visualization

  • Pros: Powerful graph library, many layouts
  • Cons: Steeper learning curve, heavier bundle
  • Verdict: react-flow is more modern and easier

10. Monitoring and Observability

Logging Strategy

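import logging
from logging.handlers import RotatingFileHandler

# Structured logging
logger = logging.getLogger("network_scanner")
logger.setLevel(logging.INFO)  # otherwise the logger inherits WARNING and drops INFO records
handler = RotatingFileHandler(
    "scanner.log", 
    maxBytes=10*1024*1024,  # 10MB
    backupCount=5
)
# Consistent field order keeps the log easy to parse
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s"))
logger.addHandler(handler)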

# Log levels:
# INFO: Scan started/completed, hosts discovered
# WARNING: Timeouts, connection errors
# ERROR: Critical failures
# DEBUG: Detailed scanning operations

Metrics to Track

  • Scan duration
  • Hosts discovered per scan
  • Average response time per host
  • Error rates
  • Database size growth
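
Most of these metrics can be derived from a handful of counters collected during a scan. The dataclass below is a sketch; the class and field names are hypothetical, and database size growth is left out because it is measured on the file rather than per scan.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScanMetrics:
    """Per-scan counters from which the metrics above are derived."""
    started_at: float            # e.g. time.monotonic() at scan start
    finished_at: float           # e.g. time.monotonic() at scan end
    hosts_probed: int
    hosts_found: int
    errors: int = 0
    response_times: list[float] = field(default_factory=list)  # seconds, one per responding host

    @property
    def duration(self) -> float:
        return self.finished_at - self.started_at

    @property
    def error_rate(self) -> float:
        # Fraction of probed hosts whose probe ended in an error
        return self.errors / self.hosts_probed if self.hosts_probed else 0.0

    @property
    def average_response_time(self) -> Optional[float]:
        if not self.response_times:
            return None
        return sum(self.response_times) / len(self.response_times)
```

A record like this could be written to the log at INFO level when a scan completes, alongside the `hosts_found` value already stored in the scans table.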

11. Future Enhancements

  1. Advanced Features:

    • Vulnerability scanning (integrate with CVE databases)
    • Network change detection and alerting
    • Historical trend analysis
    • Automated scheduling
  2. Integrations:

    • Import/export to other tools (Nessus, Wireshark)
    • Webhook notifications
    • API for external tools
  3. Visualization:

    • 3D network visualization
    • Heat maps for traffic/activity
    • Time-lapse replay of network changes
  4. Scalability:

    • Support for multiple subnets
    • Distributed scanning with agents
    • PostgreSQL for larger deployments

Quick Start Command Summary

# Install dependencies
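# python-nmap also needs the nmap binary installed on the system;
# uvicorn[standard] pulls in the WebSocket support the backend relies on
pip install fastapi "uvicorn[standard]" python-nmap sqlalchemy pydantic netifaces

# Frontend
npx create-react-app network-scanner --template typescript
cd network-scanner
npm install reactflow @mui/material @emotion/react @emotion/styled axios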

# Run development
uvicorn main:app --reload  # Backend
npm start  # Frontend

# Docker deployment
docker-compose up

Document Version: 1.0
Last Updated: December 4, 2025
Author: ArchAgent