Add files via upload
README.md
# IMX500-Object-Detection-UI

An interactive, didactic application for the Raspberry Pi 4/5 with the Sony IMX500 AI Camera. The software visualises, step by step, how object detection works: from raw data capture to the finished result. It offers two learning levels ("Schüler:innen" and "Student:innen") and is designed for use at trade fairs, in schools, and at universities.

Under the hood it is a modular, object-oriented Python application that drives the Sony IMX500 on the Raspberry Pi and presents the detection pipeline in a didactically prepared, step-by-step Pygame GUI. The code is structured for readability, maintainability, and extensibility: OOP, clear responsibilities, a modular architecture, type annotations, docstrings, and descriptive names.

## 🚀 Features

1. **Live object detection:** uses the hardware accelerator of the IMX500 sensor.
2. **Two learning levels:**
   * **Schüler:** 4 simplified steps, a playful introduction.
   * **Student:** 7 detailed steps with technical depth (pre-processing, tensors, NMS).
3. **Interactive workflow:**
   * *Live mode:* real-time detection.
   * *Analysis mode:* freeze a frame and walk through the AI pipeline step by step.
4. **Pixel inspector:** in step 1, individual pixels can be examined with the mouse (RGB values) to illustrate the idea of the image as a matrix.
5. **Gate animations:** animated explanations (image sequences) are played between the analysis steps.
6. **Bilingual & audio:** fully available in German and English, including voice-over for the explanatory texts.
7. **Didactic visualisation:**
   * simulation of resolution reduction (pixelation),
   * visualisation of feature maps (Sobel filter),
   * display of bounding boxes and confidence scores.

---

## Overview

This application demonstrates:

- Live object detection with the IMX500 sensor (via Picamera2).
- Freezing a frame and analysing it in **4 steps**:
  1. Pre-processing (original image + pixel grid, "RGB Pixel" badge).
  2. Threshold / binarisation view.
  3. Feature extraction (edges/contours with Sobel).
  4. Localisation (bounding boxes + top-3 labels with score).
- Presentation in a minimal, fullscreen Pygame GUI.

The entire logic is encapsulated in classes to achieve a clean separation of **hardware**, **data models**, **image transformations**, **rendering**, and **control logic**.

---

## Project Structure

```text
imx500_gui/
├─ app.py                 # main controller, event loop, layout, wires all components together
├─ detector.py            # IMX500Detector, Det, FrameSnapshot (hardware + parsing)
├─ steps.py               # StepTransformer + STEP_TEXT (the didactic steps)
├─ README.md              # documentation
├─ ui/                    # UI module (view)
│  ├─ __init__.py
│  ├─ theme.py            # Theme (colours, radii, styles)
│  ├─ textlayout.py       # TextLayout (wrapping, font fitting)
│  └─ renderer.py         # Renderer (all drawing operations)
└─ assets/                # media files
   ├─ Kanit-Bold.ttf      # font
   ├─ landingpagebg.jpg   # background image
   ├─ audio/              # MP3 voice-over files (DE & EN)
   ├─ schritt_1_experte/  # image sequence for the step-1 gate animation
   ├─ schritt_2_experte/  # image sequence for the step-2 gate animation
   ├─ ...                 # (further folders up to schritt_7)
   └─ schritt_7_experte/
```

Central design idea: every file/class has exactly one main responsibility (Single Responsibility Principle), which keeps the code easy to read, test, and reuse in teaching and presentations.
## 🛠 Requirements

* **Raspberry Pi 4 or 5** with the Raspberry Pi AI Camera (Sony IMX500).
* **Operating system:** Raspberry Pi OS **Bookworm (64-bit)** (the Desktop edition is recommended for the GUI).
* **Display:** monitor plus mouse/keyboard.
* **Audio:** speakers or headphones (for the voice-over).
* **Python 3.10+** with:
  * Picamera2 (including IMX500 support),
  * Pygame (GUI and event handling).

## 📦 Installation

All commands are run in a terminal. The system needs Python 3, `picamera2` (preinstalled on Bookworm), `pygame`, and the IMX500 firmware package.

1. **Clone the repository / copy the files:**

   ```bash
   git clone https://github.com/watsonove/IMX500-Object-Detection-UI/
   ```

   Make sure all project files (`app.py`, `detector.py`, `steps.py`, the `ui/` folder and the `assets/` folder) are present.

2. **Install the dependencies:**

   First make sure the Raspberry Pi is up to date:

   ```bash
   sudo apt update && sudo apt full-upgrade
   ```

   Then install the dependencies:

   ```bash
   sudo apt install python3-libcamera python3-kms++ python3-pygame
   ```

   as well as

   ```bash
   sudo apt install imx500-all
   ```

   If numpy is missing:

   ```bash
   sudo apt install python3-numpy
   ```

   A simplified alternative (may differ depending on your system):

   ```bash
   sudo apt update
   sudo apt install python3-picamera2 python3-pygame
   # install the IMX500-specific packages as documented by the vendor/distribution
   ```

   After installing the prerequisites, reboot the Raspberry Pi:

   ```bash
   sudo reboot
   ```

3. **Check the assets:**

   Make sure the folder structure is correct (see "Project Structure" above). The image sequences in `assets/schritt_X_experte/` are particularly important.

## ▶️ Running the Application

Start the application from a terminal and pass the path to your model file (e.g. a MobileNet or EfficientDet model compiled for the IMX500):

```bash
python3 app.py --model=/usr/share/imx500-models/imx500_network_ssd_mobilenetv2_fpnlite_320x320_pp.rpk
```

From the project directory (the one that contains `imx500_gui/`) it can also be started as a module:

```bash
python3 -m imx500_gui.app \
    --model=/usr/share/imx500-models/imx500_network_ssd_mobilenetv2_fpnlite_320x320_pp.rpk
```

Important options:

* `--model`: path to the IMX500 model (`.rpk`).
* `--threshold`: confidence threshold for detections (default: 0.55).
* `--iou`: IoU threshold for NMS (default: 0.65).
* `--max-detections`: maximum number of boxes.
* `--cam-width`, `--cam-height`: resolution of the camera stream.

Example:

```bash
python3 -m imx500_gui.app \
    --model=/usr/share/imx500-models/imx500_network_ssd_mobilenetv2_fpnlite_320x320_pp.rpk \
    --threshold=0.55 --iou=0.65 --max-detections=10
```
## 🎮 Controls

The application is optimised for keyboard and mouse input.

| Key / action       | Function                                                                  |
| :----------------- | :------------------------------------------------------------------------ |
| **SPACE**          | **Freeze / unfreeze:** in LIVE, freezes a frame and switches to ANALYSE (step 1); in ANALYSE, ends the analysis and returns to LIVE. |
| **ENTER**          | **Next:** goes to the next step (up to the last step) or confirms the "gate". |
| **BACKSPACE**      | **Back:** goes to the previous step (down to step 1) or back to the gate.  |
| **Mouse click**    | Operates the UI buttons (language, home, audio, level selection).          |
| **Mouse movement** | In step 1 of the analysis: shows the RGB values under the cursor.          |
| **Q** or **ESC**   | Quits the application.                                                     |
In LIVE mode the camera runs continuously and detections are drawn as boxes over the video; in ANALYSE mode a frozen snapshot is visualised in the didactic steps.

## 🌍 Language & Audio

1. Language switch: click the DE / EN button at the top right to change the language of the texts and of the audio.
2. Audio files:
   * German: `schueler_step_X.mp3`
   * English: `schueler_step_X_english.mp3`
   * The files must be located in `assets/audio/`.

## Architecture & Classes

### detector.py

Classes: `Det`, `FrameSnapshot`, `IMX500Detector`
**Det**
A plain data class for a single detected object (`label`, `conf`, `box`).

**FrameSnapshot**
Encapsulates a frozen frame including metadata:

- `frame_rgb`: the last RGB frame.
- `src_size`: original size of the frame (width, height).
- `dets`: list of all detections.
- `top_dets`: top-3 detections (for the box overlay).
- `top3`: top-3 labels + scores (for the score panel).

**IMX500Detector**

- Configures the IMX500 (network, labels, post-processing).
- Initialises and runs Picamera2 (preview stream).
- `capture_snapshot()`: one frame + detections become a `FrameSnapshot`.
- `parse_detections()`: post-processing of the network outputs (including NMS and coordinate conversion).
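For orientation, a minimal sketch of what these data classes could look like. The field names are taken from the description above; the real `detector.py` may define them differently:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

import numpy as np


@dataclass
class Det:
    """One detected object."""
    label: str
    conf: float
    box: Tuple[int, int, int, int]  # (x, y, w, h) in source-frame coordinates


@dataclass
class FrameSnapshot:
    """A frozen frame plus everything the UI needs to explain it."""
    src_size: Tuple[int, int]                   # (width, height) of the camera frame
    frame_rgb: Optional[np.ndarray] = None      # H x W x 3 RGB image
    dets: List[Det] = field(default_factory=list)
    top_dets: List[Det] = field(default_factory=list)
    top3: List[Tuple[str, float]] = field(default_factory=list)
    debug: dict = field(default_factory=dict)   # ROI, output shapes, etc.
```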
### steps.py

Classes/structures: `StepInfo`, `STEP_TEXT`, `StepTransformer`

**StepInfo**
Title + body text of one step (for the right-hand panel).

**STEP_TEXT**
Maps the step index (1-4) to a `StepInfo` and describes, in natural language:

- pre-processing,
- feature extraction (edges/contours),
- classification (scores),
- localisation (bounding boxes).

**StepTransformer**

Implements the image transformations for the left video panel:

- Step 1: original image (overlays are drawn by the renderer).
- Step 2: binarised, vignette-boosted grey view.
- Step 3: edges/contours via Sobel, inverted.
- Step 4: original image (overlays are drawn by the renderer).

The logic is fully decoupled from Pygame and uses only NumPy operations.
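A minimal NumPy-only sketch of the kind of inverted Sobel edge map that step 3 describes. This is illustrative; the project's `StepTransformer.apply()` may be implemented differently:

```python
import numpy as np


def sobel_edges(frame_rgb: np.ndarray) -> np.ndarray:
    """Return an inverted edge map (H x W x 3, uint8) from an RGB frame."""
    gray = frame_rgb.astype(np.float32).mean(axis=2)

    # Sobel kernels applied via shifted windows (pure NumPy, no SciPy needed).
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
    ky = kx.T
    pad = np.pad(gray, 1, mode="edge")
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    for dy in range(3):
        for dx in range(3):
            window = pad[dy:dy + gray.shape[0], dx:dx + gray.shape[1]]
            gx += kx[dy, dx] * window
            gy += ky[dy, dx] * window

    mag = np.sqrt(gx ** 2 + gy ** 2)
    mag = (mag / (mag.max() + 1e-6) * 255.0).astype(np.uint8)
    inverted = 255 - mag                              # dark edges on a light background
    return np.repeat(inverted[..., None], 3, axis=2)  # back to 3 channels for display
```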
### ui/theme.py

Class: `Theme`

Central place for:

- background colour, panel colours, lines,
- text colours (normal/muted),
- accent colours (scores),
- border radius (`RADIUS`).

Look-and-feel changes happen here without touching any logic code.

### ui/textlayout.py

Class: `TextLayout`

Responsible for:

- `wrap_lines(...)`: word-wise line wrapping based on `font.size()` and the panel width.
- `fit_title_and_body(...)`: binary search over the font size until title and body fit into the available area.
- `draw_wrapped_lines(...)`: draws wrapped text onto a surface (including line spacing).

The class is deliberately kept generic so that it can be reused in other Pygame projects.
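A sketch of the binary-search idea behind `fit_title_and_body(...)`. The helper names (`fits`, `fit_font_size`) and signatures are illustrative, not the actual `TextLayout` API:

```python
import pygame


def fits(text: str, font: pygame.font.Font, max_width: int, max_height: int) -> bool:
    """Wrap word-wise and check whether the block stays inside the box."""
    lines, current = [], ""
    for word in text.split():
        candidate = (current + " " + word).strip()
        if font.size(candidate)[0] <= max_width:
            current = candidate
        else:
            lines.append(current)
            current = word
    lines.append(current)
    return len(lines) * font.get_linesize() <= max_height


def fit_font_size(text: str, font_path: str, max_width: int, max_height: int,
                  min_size: int = 12, max_size: int = 64) -> int:
    """Binary search for the largest font size whose wrapped text still fits."""
    pygame.font.init()  # safe to call repeatedly
    lo, hi = min_size, max_size
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if fits(text, pygame.font.Font(font_path, mid), max_width, max_height):
            lo = mid        # still fits, try a larger size
        else:
            hi = mid - 1    # too big, shrink
    return lo
```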
### ui/renderer.py

Class: `Renderer`

The rendering layer for everything that gets drawn:

- cards/panels (rounded rectangles, outlines),
- text rendering (labels, headings),
- pills (boxes behind labels),
- the pixel grid (for step 1, "RGB Pixel"),
- the step indicator (4 circles + progress line),
- the score bar chart (including the threshold line).

Important methods:

- `draw_card(...)`, `draw_text(...)`
- `draw_pixel_grid(...)`, `draw_pill(...)`
- `rect_in_video_coords(...)`: converts boxes from source-frame coordinates to the video panel.
- `draw_step_indicator(...)`
- `draw_bar_chart(...)`
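The conversion that `rect_in_video_coords(...)` performs is essentially a scale from the camera frame into the panel rectangle (the panel shows the frame stretched to its full size). A minimal sketch with an illustrative signature:

```python
import pygame


def rect_in_video_coords(box, src_size, video_rect: pygame.Rect) -> pygame.Rect:
    """Map a (x, y, w, h) box from source-frame pixels into the video panel."""
    x, y, w, h = box
    src_w, src_h = src_size
    scale_x = video_rect.w / src_w
    scale_y = video_rect.h / src_h
    return pygame.Rect(
        video_rect.x + int(x * scale_x),
        video_rect.y + int(y * scale_y),
        max(1, int(w * scale_x)),
        max(1, int(h * scale_y)),
    )
```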
### app.py

Class: `App`

Orchestrates all components:

- Instantiates `IMX500Detector`, `StepTransformer`, `TextLayout`, `Renderer`, `Theme`.
- Manages the state:
  - `mode`: "LIVE" or "ANALYSE",
  - `step`: 0-4,
  - `snapshot`: the current `FrameSnapshot`.
- Implements the Pygame event loop:
  - `handle_events()`: processes keyboard input,
  - `update()`: refreshes the snapshot in LIVE mode,
  - `draw()`: computes the layout and splits the rendering into three clearly separated areas:
    - `_draw_left_view(...)`
    - `_draw_header(...)`
    - `_draw_right_panel(...)`

Flow:

- LIVE: continuous capturing via `IMX500Detector`.
- SPACE: freeze a snapshot and switch to ANALYSE (step 1).
- ENTER / BACKSPACE: walk through steps 1-4 (different visualisations).
- SPACE: back to LIVE.
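The main loop stays deliberately small. A self-contained toy version of the same handle_events / update / draw split (not the project code itself, which is shown in `app.py` below):

```python
import pygame


def main() -> None:
    """Minimal loop in the same style as App.run(): events, update, draw at 30 FPS."""
    pygame.init()
    screen = pygame.display.set_mode((640, 360))
    clock = pygame.time.Clock()
    running = True
    try:
        while running:
            # handle_events(): quit on window close or ESC/q
            for event in pygame.event.get():
                if event.type == pygame.QUIT:
                    running = False
                elif event.type == pygame.KEYDOWN and event.key in (pygame.K_ESCAPE, pygame.K_q):
                    running = False
            # update(): nothing to refresh in this toy example
            # draw(): clear the screen and present the frame
            screen.fill((15, 16, 20))
            pygame.display.flip()
            clock.tick(30)
    finally:
        pygame.quit()


if __name__ == "__main__":
    main()
```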
## UML Diagram (Text Form)

The following simplified UML class overview can be used for the project report or the documentation:

```text
+----------------------+
| IMX500Detector       |
+----------------------+
| - args               |
| - imx500             |
| - intrinsics         |
| - picam2             |
+----------------------+
| + capture_snapshot() |
| + parse_detections() |
| + stop()             |
+----------------------+

+----------------------+
| Det                  |
+----------------------+
| + label: str         |
| + conf: float        |
| + box: (x, y, w, h)  |
+----------------------+

+------------------------------+
| FrameSnapshot                |
+------------------------------+
| + frame_rgb: np.ndarray?     |
| + src_size: (int, int)       |
| + dets: List[Det]            |
| + top3: List[(str, float)]   |
| + top_dets: List[Det]        |
+------------------------------+

+----------------------+
| StepTransformer      |
+----------------------+
| + apply(frame, step) |
+----------------------+

+----------------------+
| StepInfo             |
+----------------------+
| + title: str         |
| + body: str          |
+----------------------+

+----------------------+
| Theme                |
+----------------------+
| + BG                 |
| + PANEL              |
| + ...                |
+----------------------+

+---------------------------+
| TextLayout                |
+---------------------------+
| + wrap_lines(...)         |
| + draw_wrapped_lines(...) |
| + fit_title_and_body(...) |
+---------------------------+

+----------------------+
| Renderer             |
+----------------------+
| - t: Theme           |
+----------------------+
| + draw_card(...)     |
| + draw_text(...)     |
| + draw_pixel_grid()  |
| + draw_pill(...)     |
| + rect_in_video_...  |
| + draw_step_...      |
| + draw_bar_chart()   |
+----------------------+

+--------------------------------+
| App                            |
+--------------------------------+
| - args                         |
| - theme: Theme                 |
| - detector: IMX500Detector     |
| - transformer: StepTransformer |
| - text_layout: TextLayout      |
| - renderer: Renderer           |
| - mode: str                    |
| - step: int                    |
| - snapshot: FrameSnapshot      |
| - ...                          |
+--------------------------------+
| + run()                        |
| - handle_events()              |
| - update()                     |
| - draw()                       |
| - _draw_left_view(...)         |
| - _draw_header(...)            |
| - _draw_right_panel(...)       |
+--------------------------------+
```

Relationships:

- App → IMX500Detector (composition)
- App → StepTransformer
- App → TextLayout
- App → Renderer
- App uses FrameSnapshot and Det as data models
- Renderer uses Theme
- StepTransformer uses STEP_TEXT indirectly via App (for the descriptions) and NumPy for the transformations

You can either use this diagram as-is or transfer it into a graphics tool (PlantUML, draw.io, etc.) and export it as an image.
## Design Decisions (Clean Code / OOP)

- **Single responsibility:** each class has exactly one main task (detector vs. steps vs. rendering vs. layout).
- **Separate levels of abstraction:**
  - hardware access in `detector.py`,
  - domain processing of the steps in `steps.py`,
  - UI-specific presentation in `ui/`,
  - control logic in `app.py`.
- **Readable structure:** the main loop in `App.run()` is short; more complex work is delegated to small, well-named methods.
- **Reusability:** `TextLayout` and `Renderer` are generic enough to be used in other Pygame projects.
## Possible Extensions

- **Demo mode without hardware:** implement a `DummyDetector` that loads a sample image instead of using the camera; ideal for presentations or development on a laptop without an IMX500 (see the sketch after this list).
- **Configuration file:** load thresholds, colours, and texts from a `.toml`/`.yaml` file to minimise code changes.
- **Testability:** unit tests for `StepTransformer` (shapes, value ranges) and for the `TextLayout` functions (line wrapping, font sizes).
- **Logging:** report FPS, the number of detections, and the current step number via the logging module.
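A possible shape for such a `DummyDetector` (hypothetical code; it assumes `Det` and `FrameSnapshot` can be constructed with the fields listed above, and that `App` only needs `capture_snapshot()` and `stop()`):

```python
import numpy as np

from detector import Det, FrameSnapshot


class DummyDetector:
    """Stand-in for IMX500Detector: serves a static frame plus fake detections."""

    def __init__(self, size=(1280, 720)):
        w, h = size
        self.src_size = (w, h)
        # Flat dark-grey frame as a stand-in for a real camera image.
        self.frame = np.full((h, w, 3), 40, dtype=np.uint8)

    def capture_snapshot(self) -> FrameSnapshot:
        dets = [Det(label="demo object", conf=0.87, box=(100, 80, 320, 240))]
        return FrameSnapshot(
            src_size=self.src_size,
            frame_rgb=self.frame,
            dets=dets,
            top_dets=dets,
            top3=[("demo object", 0.87)],
        )

    def stop(self) -> None:
        pass
```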
## 📝 License & Authorship

This project is licensed under the MIT License; see the [LICENSE.md](LICENSE.md) file for details.

Author, date, project context, and version can be documented in README.md and at the top of the Python files.
__init__.py
# IMX500 GUI - main package

# Used by app.py
from .detector import IMX500Detector, FrameSnapshot, Det
from .steps import StepTransformer, STEP
BIN __pycache__/detector.cpython-311.pyc
BIN __pycache__/steps.cpython-311.pyc

app.py
#!/usr/bin/env python3
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
import argparse
|
||||
import os
|
||||
from dataclasses import dataclass
|
||||
from typing import Dict, Tuple, List, Optional
|
||||
|
||||
import pygame
|
||||
import pygame.surfarray as surfarray
|
||||
|
||||
from detector import IMX500Detector, FrameSnapshot
|
||||
from steps import StepTransformer, build_step_text, build_gate_text, total_steps_for_level
|
||||
from ui.theme import Theme
|
||||
from ui.textlayout import TextLayout
|
||||
from ui.renderer import Renderer
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class StepInfo:
|
||||
title: str
|
||||
body: str
|
||||
|
||||
|
||||
class App:
|
||||
def __init__(self, args: argparse.Namespace):
|
||||
self.args = args
|
||||
|
||||
# UI services
|
||||
self.theme = Theme()
|
||||
self.text_layout = TextLayout()
|
||||
self.renderer = Renderer(self.theme)
|
||||
|
||||
# Pygame setup
|
||||
pygame.init()
|
||||
pygame.font.init()
|
||||
pygame.mixer.init()
|
||||
|
||||
self.screen = self._set_fullscreen()
|
||||
self.win_w, self.win_h = self.screen.get_size()
|
||||
pygame.display.set_caption("IMX500 GUI")
|
||||
|
||||
# Assets paths
|
||||
self.base_dir = os.path.dirname(os.path.abspath(__file__))
|
||||
self.landing_bg_path = "assets/landingpagebg.jpg"
|
||||
self.font_path = "assets/Kanit-Bold.ttf"
|
||||
self.audio_dir = os.path.join(self.base_dir, "assets", "audio")
|
||||
|
||||
# Background caches
|
||||
self._landing_bg_original = None
|
||||
self._landing_bg_scaled = None
|
||||
self._landing_bg_scaled_size = None
|
||||
self._load_landing_bg()
|
||||
|
||||
# Application State
|
||||
self.running = True
|
||||
self.state = "LIVEINTRO" # LIVEINTRO / LANDING / RUNNING / GATE
|
||||
self.mode = "LIVE" # LIVE / ANALYSE
|
||||
self.step = 0 # 0=LIVE, 1..N=Analyse
|
||||
self.clock = pygame.time.Clock()
|
||||
|
||||
# User Configuration
|
||||
self.lang = "DE"
|
||||
self.level = None # "SCHUELER" / "STUDENT"
|
||||
|
||||
# Logic Components
|
||||
self.detector: IMX500Detector | None = None
|
||||
self.transformer: StepTransformer | None = None
|
||||
self.snapshot: FrameSnapshot | None = None
|
||||
|
||||
# Simulation / Rendering Caches
|
||||
self.sim_surface = None
|
||||
self.sim_step_cached: int | None = None
|
||||
|
||||
# Gate Text Caches
|
||||
self.gate_cached_step: int | None = None
|
||||
self.gate_cached_title_font = None
|
||||
self.gate_cached_body_font = None
|
||||
self.gate_cached_title_lines = None
|
||||
self.gate_cached_body_lines = None
|
||||
|
||||
# Gate Animation
|
||||
self.gate_anim_frames: list[pygame.Surface] | None = None
|
||||
self.gate_anim_key: str | None = None
|
||||
self.gate_anim_idx: int = 0
|
||||
self.gate_anim_last_ms: int = 0
|
||||
self.gate_anim_frame_ms: int = 55
|
||||
|
||||
# Hitboxes (General)
|
||||
self.lang_button_rect = pygame.Rect(0, 0, 0, 0)
|
||||
self.home_button_rect = pygame.Rect(0, 0, 0, 0)
|
||||
self.level_left_rect = pygame.Rect(0, 0, 0, 0)
|
||||
self.level_right_rect = pygame.Rect(0, 0, 0, 0)
|
||||
self.cta_button_rect = pygame.Rect(0, 0, 0, 0)
|
||||
|
||||
# Hitboxes (Gate)
|
||||
self.gate_prev_rect = pygame.Rect(0, 0, 0, 0)
|
||||
self.gate_next_rect = pygame.Rect(0, 0, 0, 0)
|
||||
|
||||
# Hitboxes (Navigation/Audio)
|
||||
self.nav_prev_rect = pygame.Rect(0, 0, 0, 0)
|
||||
self.nav_next_rect = pygame.Rect(0, 0, 0, 0)
|
||||
self.nav_action_rect = pygame.Rect(0, 0, 0, 0)
|
||||
self.nav_audio_rect = pygame.Rect(0, 0, 0, 0)
|
||||
|
||||
# Audio State
|
||||
self.current_audio_file: str | None = None
|
||||
self.audio_is_paused = False
|
||||
|
||||
# Gate Logic
|
||||
self.gate_next_step: int | None = None
|
||||
|
||||
# Text Resources
|
||||
self.UI = {
|
||||
"DE": {
|
||||
"landing_title": "KI OBJEKTERKENNUNG",
|
||||
"landing_sub": "wähle hier dein Niveau",
|
||||
"level_left": "Schüler:in",
|
||||
"level_right": "Student:in",
|
||||
"hint": "ESC/q zum Beenden",
|
||||
"live_hint": "SPACE: Analyse",
|
||||
"quit_hint": "ESC/q: Quit",
|
||||
"analyse_hint": "ENTER next | BACKSPACE prev | SPACE end",
|
||||
"workflow": "Workflow",
|
||||
"workflow_hint": "SPACE friert ein & wechselt zu Analyse.",
|
||||
"lang_label": "DE",
|
||||
"liveintro_cta": "Was erkennt die KI? / What does the AI detect?",
|
||||
"home": "Home",
|
||||
"gate_btn": "Weiter",
|
||||
"back": "Zurück"
|
||||
},
|
||||
"EN": {
|
||||
"landing_title": "AI OBJECT DETECTION",
|
||||
"landing_sub": "choose your level",
|
||||
"level_left": "Pupil",
|
||||
"level_right": "Student",
|
||||
"hint": "ESC/q to quit",
|
||||
"live_hint": "SPACE: Analyse",
|
||||
"quit_hint": "ESC/q: Quit",
|
||||
"analyse_hint": "ENTER next | BACKSPACE prev | SPACE end",
|
||||
"workflow": "Workflow",
|
||||
"workflow_hint": "SPACE freezes & switches to Analyse.",
|
||||
"lang_label": "EN",
|
||||
"liveintro_cta": "Was erkennt die KI? / What does the AI detect?",
|
||||
"home": "Home",
|
||||
"gate_btn": "Continue",
|
||||
"back": "Back"
|
||||
},
|
||||
}
|
||||
|
||||
self._recompute_responsive()
|
||||
|
||||
# ---------------- Responsive System ----------------
|
||||
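# Scale factor relative to a 1920x1080 reference layout, clamped to [0.6, 1.6].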
def _scale(self) -> float:
|
||||
s_w = self.win_w / 1920.0
|
||||
s_h = self.win_h / 1080.0
|
||||
return max(0.6, min(1.6, (s_w + s_h) * 0.5))
|
||||
|
||||
def _fsize(self, px: float) -> int:
|
||||
return max(12, int(px * self._scale()))
|
||||
|
||||
def _load_font(self, size: int) -> pygame.font.Font:
|
||||
try:
|
||||
return pygame.font.Font(self.font_path, size)
|
||||
except Exception:
|
||||
return pygame.font.Font(None, size)
|
||||
|
||||
def _recompute_responsive(self) -> None:
|
||||
self.win_w, self.win_h = self.screen.get_size()
|
||||
|
||||
self.font_ui = self._load_font(self._fsize(22))
|
||||
self.font_small = self._load_font(self._fsize(20))
|
||||
self.font_header = self._load_font(self._fsize(28))
|
||||
self.font_title = self._load_font(self._fsize(38))
|
||||
self.font_landing_title = self._load_font(self._fsize(86))
|
||||
self.font_landing_sub = self._load_font(self._fsize(44))
|
||||
self.font_landing_btn = self._load_font(self._fsize(44))
|
||||
|
||||
self.pad = max(12, int(18 * self._scale()))
|
||||
|
||||
# Button sizes
|
||||
self.lang_btn_w = max(72, int(92 * self._scale()))
|
||||
self.lang_btn_h = max(36, int(44 * self._scale()))
|
||||
self.home_btn_w = max(int(96 * self._scale()), 80)
|
||||
self.home_btn_h = max(int(44 * self._scale()), 34)
|
||||
|
||||
self.level_radius = max(18, int(28 * self._scale()))
|
||||
self.level_border_w = max(2, int(4 * self._scale()))
|
||||
|
||||
self._landing_bg_scaled = None
|
||||
self._landing_bg_scaled_size = None
|
||||
|
||||
# ---------------- Helpers ----------------
|
||||
def _t(self, key: str) -> str:
|
||||
return self.UI[self.lang][key]
|
||||
|
||||
def _total_steps(self) -> int:
|
||||
return total_steps_for_level(self.level or "SCHUELER")
|
||||
|
||||
def _step_map(self) -> Dict[int, object]:
|
||||
assert self.snapshot is not None
|
||||
assert self.level is not None
|
||||
return build_step_text(lang=self.lang, level=self.level, debug=self.snapshot.debug)
|
||||
|
||||
def _gate_map(self) -> Dict[int, object]:
|
||||
assert self.snapshot is not None
|
||||
assert self.level is not None
|
||||
return build_gate_text(lang=self.lang, level=self.level, debug=self.snapshot.debug)
|
||||
|
||||
def _set_fullscreen(self) -> pygame.Surface:
|
||||
info = pygame.display.Info()
|
||||
return pygame.display.set_mode((info.current_w, info.current_h), pygame.FULLSCREEN)
|
||||
|
||||
def _invalidate_caches(self) -> None:
|
||||
self.sim_step_cached = None
|
||||
self.gate_cached_step = None
|
||||
self._stop_audio()
|
||||
|
||||
def _ensure_camera(self) -> None:
|
||||
if self.detector is None:
|
||||
self.detector = IMX500Detector(self.args)
|
||||
if self.snapshot is None:
|
||||
self.snapshot = FrameSnapshot(src_size=(self.args.cam_width, self.args.cam_height))
|
||||
|
||||
def _go_home(self) -> None:
|
||||
self.state = "LIVEINTRO"
|
||||
self.mode = "LIVE"
|
||||
self.step = 0
|
||||
self.level = None
|
||||
self.gate_next_step = None
|
||||
self._invalidate_caches()
|
||||
|
||||
def _is_student(self) -> bool:
|
||||
return self.level == "STUDENT"
|
||||
|
||||
def _load_landing_bg(self) -> None:
|
||||
try:
|
||||
path = os.path.join(self.base_dir, self.landing_bg_path)
|
||||
self._landing_bg_original = pygame.image.load(path).convert()
|
||||
except Exception:
|
||||
self._landing_bg_original = None
|
||||
self._landing_bg_scaled = None
|
||||
self._landing_bg_scaled_size = None
|
||||
|
||||
def _get_landing_bg_scaled(self) -> pygame.Surface | None:
|
||||
if self._landing_bg_original is None:
|
||||
return None
|
||||
size = (self.win_w, self.win_h)
|
||||
if self._landing_bg_scaled is None or self._landing_bg_scaled_size != size:
|
||||
self._landing_bg_scaled = pygame.transform.smoothscale(self._landing_bg_original, size)
|
||||
self._landing_bg_scaled_size = size
|
||||
return self._landing_bg_scaled
|
||||
|
||||
# ---------------- Audio Logic ----------------
|
||||
def _get_audio_filename(self) -> str | None:
|
||||
"""Ermittelt den Dateinamen basierend auf Level und Step."""
|
||||
# Strict rule: audio exists only for the SCHUELER level
|
||||
if self.level != "SCHUELER":
|
||||
return None
|
||||
|
||||
if self.lang == "EN":
|
||||
filename = f"schueler_step_{self.step}_english.mp3"
|
||||
else:
|
||||
filename = f"schueler_step_{self.step}.mp3"
|
||||
|
||||
path = os.path.join(self.audio_dir, filename)
|
||||
if os.path.exists(path):
|
||||
return path
|
||||
return None
|
||||
|
||||
def _toggle_audio(self) -> None:
|
||||
if self.audio_is_paused:
|
||||
pygame.mixer.music.unpause()
|
||||
self.audio_is_paused = False
|
||||
return
|
||||
|
||||
if pygame.mixer.music.get_busy():
|
||||
pygame.mixer.music.pause()
|
||||
self.audio_is_paused = True
|
||||
return
|
||||
|
||||
path = self._get_audio_filename()
|
||||
if path:
|
||||
try:
|
||||
pygame.mixer.music.load(path)
|
||||
pygame.mixer.music.play()
|
||||
self.current_audio_file = path
|
||||
self.audio_is_paused = False
|
||||
except Exception as e:
|
||||
print(f"Audio error: {e}")
|
||||
|
||||
def _stop_audio(self) -> None:
|
||||
pygame.mixer.music.stop()
|
||||
self.audio_is_paused = False
|
||||
self.current_audio_file = None
|
||||
|
||||
# ---------------- Layout Calculation ----------------
|
||||
def make_layout(self) -> Tuple[pygame.Rect, pygame.Rect, pygame.Rect, Optional[pygame.Rect]]:
|
||||
pad = self.pad
|
||||
|
||||
# 1. Panel Decision
|
||||
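# The results panel appears only late in the analysis: from step 3 (SCHUELER) or step 6 (STUDENT).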
show_panel = False
|
||||
if self.mode == "ANALYSE":
|
||||
if self.level == "SCHUELER":
|
||||
show_panel = (self.step >= 3)
|
||||
else:
|
||||
show_panel = (self.step >= 6)
|
||||
|
||||
# 2. Vertical Grid
|
||||
title_h = max(60, int(80 * self._scale()))
|
||||
nav_h = max(80, int(100 * self._scale()))
|
||||
available_h = self.win_h - title_h - nav_h - 4 * pad
|
||||
y_title = pad
|
||||
y_content = y_title + title_h + pad
|
||||
y_nav = y_content + available_h + pad
|
||||
|
||||
# 3. Horizontal Grid
|
||||
# Change: Force 1:1 Aspect Ratio (Square) ONLY for Student Steps 2-6
|
||||
target_ar = 16 / 9
|
||||
if self.level == "STUDENT" and (2 <= self.step <= 6):
|
||||
target_ar = 1.0  # 1:1 aspect ratio (square)
|
||||
|
||||
if show_panel:
|
||||
panel_w = max(int(350 * self._scale()), int(self.win_w * 0.28))
|
||||
max_video_w = self.win_w - panel_w - 3 * pad
|
||||
video_w = int(available_h * target_ar)
|
||||
if video_w > max_video_w:
|
||||
video_w = max_video_w
|
||||
|
||||
video_rect = pygame.Rect(pad, y_content, video_w, available_h)
|
||||
panel_x = video_rect.right + pad
|
||||
panel_w_actual = self.win_w - panel_x - pad
|
||||
panel_rect = pygame.Rect(panel_x, y_content, panel_w_actual, available_h)
|
||||
else:
|
||||
panel_rect = None
|
||||
max_video_w = self.win_w - 2 * pad
|
||||
video_w = int(available_h * target_ar)
|
||||
if video_w < max_video_w:
|
||||
x_video = pad + (max_video_w - video_w) // 2
|
||||
else:
|
||||
x_video = pad
|
||||
video_w = max_video_w
|
||||
video_rect = pygame.Rect(x_video, y_content, video_w, available_h)
|
||||
|
||||
title_rect = pygame.Rect(pad, y_title, self.win_w - 2*pad, title_h)
|
||||
nav_rect = pygame.Rect(video_rect.x, y_nav, video_rect.w, nav_h)
|
||||
|
||||
return video_rect, title_rect, nav_rect, panel_rect
|
||||
|
||||
# ---------------- UI Drawing ----------------
|
||||
|
||||
def _draw_title_area(self, rect: pygame.Rect) -> None:
|
||||
if self.mode == "LIVE":
|
||||
text = "Live Workflow"
|
||||
else:
|
||||
step_map = self._step_map()
|
||||
info = step_map.get(self.step)
|
||||
text = info.title if info else f"Schritt {self.step}"
|
||||
self.renderer.draw_text(self.screen, self.font_title, text, (rect.x, rect.centery - self.font_title.get_height()//2), self.theme.TEXT)
|
||||
|
||||
def _draw_step_indicator_global(self) -> None:
|
||||
buttons_w = self.home_btn_w + self.lang_btn_w + 3 * self.pad
|
||||
ind_w = int(self.win_w * 0.25)
|
||||
ind_h = int(26 * self._scale())
|
||||
gap = int(50 * self._scale())
|
||||
x = self.win_w - buttons_w - ind_w - gap
|
||||
title_h = max(60, int(80 * self._scale()))
|
||||
y = self.pad + (title_h - ind_h) // 2
|
||||
|
||||
rect = pygame.Rect(x, y, ind_w, ind_h)
|
||||
self.renderer.draw_step_indicator(self.screen, rect, step=self.step, total_steps=self._total_steps(), font_small=self.font_small)
|
||||
|
||||
def _draw_navigation_area(self, rect: pygame.Rect) -> None:
|
||||
self.nav_prev_rect = pygame.Rect(0,0,0,0)
|
||||
self.nav_next_rect = pygame.Rect(0,0,0,0)
|
||||
self.nav_action_rect = pygame.Rect(0,0,0,0)
|
||||
self.nav_audio_rect = pygame.Rect(0,0,0,0)
|
||||
|
||||
if self.mode == "LIVE":
|
||||
btn_w = min(rect.w, int(600 * self._scale()))
|
||||
btn_h = rect.h
|
||||
btn_x = rect.centerx - btn_w // 2
|
||||
self.nav_action_rect = pygame.Rect(btn_x, rect.y, btn_w, btn_h)
|
||||
lbl = "Analyse starten" if self.lang == "DE" else "Start Analysis"
|
||||
self.renderer.draw_button(self.screen, self.nav_action_rect, lbl, self.font_ui, primary=True)
|
||||
|
||||
else:
|
||||
# Audio Check
|
||||
audio_path = self._get_audio_filename()
|
||||
has_audio = (audio_path is not None)
|
||||
|
||||
gap = int(20 * self._scale())
|
||||
|
||||
if has_audio:
|
||||
btn_w = (rect.w - 2 * gap) // 3
|
||||
btn_h = rect.h
|
||||
|
||||
self.nav_prev_rect = pygame.Rect(rect.x, rect.y, btn_w, btn_h)
|
||||
self.nav_audio_rect = pygame.Rect(rect.x + btn_w + gap, rect.y, btn_w, btn_h)
|
||||
self.nav_next_rect = pygame.Rect(rect.x + 2*btn_w + 2*gap, rect.y, btn_w, btn_h)
|
||||
|
||||
is_playing = pygame.mixer.music.get_busy() and not self.audio_is_paused
|
||||
audio_lbl = "Audio ||" if is_playing else "Audio ▶"
|
||||
|
||||
self.renderer.draw_button(self.screen, self.nav_audio_rect, audio_lbl, self.font_ui, primary=True)
|
||||
|
||||
else:
|
||||
btn_w = (rect.w - gap) // 2
|
||||
btn_h = rect.h
|
||||
self.nav_prev_rect = pygame.Rect(rect.x, rect.y, btn_w, btn_h)
|
||||
self.nav_next_rect = pygame.Rect(rect.x + btn_w + gap, rect.y, btn_w, btn_h)
|
||||
|
||||
lbl_prev = "Zurück" if self.lang == "DE" else "Back"
|
||||
is_last = self.step >= self._total_steps()
|
||||
lbl_next = ("Beenden" if self.lang == "DE" else "Finish") if is_last else ("Weiter" if self.lang == "DE" else "Next")
|
||||
|
||||
self.renderer.draw_button(self.screen, self.nav_prev_rect, lbl_prev, self.font_ui, primary=False)
|
||||
self.renderer.draw_button(self.screen, self.nav_next_rect, lbl_next, self.font_ui, primary=True)
|
||||
|
||||
def _draw_right_panel_content(self, panel_rect: pygame.Rect) -> None:
|
||||
self.renderer.draw_card(self.screen, panel_rect, fill=self.theme.PANEL_2, outline=self.theme.LINE, radius=self.theme.RADIUS)
|
||||
header_h = int(60 * self._scale())
|
||||
self.renderer.draw_text(self.screen, self.font_header, "Ergebnisse", (panel_rect.x + 20, panel_rect.y + 20), self.theme.TEXT)
|
||||
content_rect = pygame.Rect(
|
||||
panel_rect.x + 14,
|
||||
panel_rect.y + header_h,
|
||||
panel_rect.w - 28,
|
||||
panel_rect.h - header_h - 14
|
||||
)
|
||||
chart_font = self._load_font(self._fsize(26))
|
||||
self.renderer.draw_bar_chart(self.screen, content_rect, self.snapshot.top3, self.args.threshold, chart_font, self.font_ui)
|
||||
|
||||
def _draw_home_button(self) -> None:
|
||||
pad = self.pad
|
||||
w, h = self.home_btn_w, self.home_btn_h
|
||||
self.home_button_rect = pygame.Rect(self.win_w - w - pad, pad, w, h)
|
||||
self.renderer.draw_button(self.screen, self.home_button_rect, self._t("home"), self.font_ui, primary=False)
|
||||
|
||||
def _draw_lang_button(self) -> None:
|
||||
w, h = self.lang_btn_w, self.lang_btn_h
|
||||
pad = self.pad
|
||||
x = self.win_w - self.home_btn_w - 2 * pad - w
|
||||
self.lang_button_rect = pygame.Rect(x, pad, w, h)
|
||||
self.renderer.draw_card(self.screen, self.lang_button_rect, fill=self.theme.PANEL_2, outline=self.theme.LINE, radius=max(10, int(12 * self._scale())))
|
||||
pygame.draw.rect(self.screen, self.theme.ACCENT, self.lang_button_rect, width=max(1, int(2 * self._scale())), border_radius=max(10, int(12 * self._scale())))
|
||||
txt = self.font_ui.render(self._t("lang_label"), True, self.theme.TEXT)
|
||||
self.screen.blit(txt, txt.get_rect(center=self.lang_button_rect.center))
|
||||
|
||||
def _toggle_lang(self) -> None:
|
||||
self.lang = "EN" if self.lang == "DE" else "DE"
|
||||
self._invalidate_caches()
|
||||
|
||||
# ---------------- Live Intro / Landing / Gate ----------------
|
||||
def _liveintro_button_layout(self) -> None:
|
||||
btn_w = int(self.win_w * 0.60)
|
||||
btn_w = max(int(520 * self._scale()), min(btn_w, int(1400 * self._scale())))
|
||||
btn_h = max(int(90 * self._scale()), int(self.win_h * 0.11))
|
||||
btn_x = (self.win_w - btn_w) // 2
|
||||
btn_y = int(self.win_h * 0.72)
|
||||
self.cta_button_rect = pygame.Rect(btn_x, btn_y, btn_w, btn_h)
|
||||
|
||||
def _draw_liveintro(self) -> None:
|
||||
self._recompute_responsive()
|
||||
self.screen.fill(self.theme.BG)
|
||||
if self.snapshot is not None and self.snapshot.frame_rgb is not None:
|
||||
surf = surfarray.make_surface(self.snapshot.frame_rgb.swapaxes(0, 1))
|
||||
self.screen.blit(pygame.transform.smoothscale(surf, (self.win_w, self.win_h)), (0, 0))
|
||||
if self.snapshot.dets:
|
||||
for d in self.snapshot.dets:
|
||||
r = self.renderer.rect_in_video_coords(d.box, self.snapshot.src_size, pygame.Rect(0, 0, self.win_w, self.win_h))
|
||||
pygame.draw.rect(self.screen, self.renderer.conf_color(d.conf), r, width=2)
|
||||
self._liveintro_button_layout()
|
||||
self.renderer.draw_button(self.screen, self.cta_button_rect, self._t("liveintro_cta"), self.font_landing_btn, primary=True)
|
||||
pygame.display.flip()
|
||||
self.clock.tick(30)
|
||||
|
||||
def _landing_layout(self) -> None:
|
||||
bw = int(self.win_w * 0.30)
|
||||
bh = int(self.win_h * 0.18)
|
||||
bw = max(int(300 * self._scale()), min(bw, int(720 * self._scale())))
|
||||
bh = max(int(120 * self._scale()), min(bh, int(240 * self._scale())))
|
||||
y = int(self.win_h * 0.62)
|
||||
gap = max(int(self.win_w * 0.06), int(100 * self._scale()))
|
||||
left_x = (self.win_w // 2) - (gap // 2) - bw
|
||||
right_x = (self.win_w // 2) + (gap // 2)
|
||||
self.level_left_rect = pygame.Rect(left_x, y, bw, bh)
|
||||
self.level_right_rect = pygame.Rect(right_x, y, bw, bh)
|
||||
|
||||
def _draw_level_button(self, rect: pygame.Rect, label: str) -> None:
|
||||
self.renderer.draw_button(self.screen, rect, label, self.font_landing_btn, primary=True, border_width=self.level_border_w)
|
||||
|
||||
def _draw_landing(self) -> None:
|
||||
self._recompute_responsive()
|
||||
bg = self._get_landing_bg_scaled()
|
||||
if bg is not None: self.screen.blit(bg, (0, 0))
|
||||
else: self.screen.fill((7, 14, 26))
|
||||
overlay = pygame.Surface((self.win_w, self.win_h), pygame.SRCALPHA)
|
||||
overlay.fill((0, 0, 0, 70))
|
||||
self.screen.blit(overlay, (0, 0))
|
||||
self._landing_layout()
|
||||
col = self.theme.BTN_PRIMARY_BORDER
|
||||
title_surf = self.font_landing_title.render(self._t("landing_title"), True, col)
|
||||
self.screen.blit(title_surf, title_surf.get_rect(center=(self.win_w // 2, int(self.win_h * 0.22))))
|
||||
sub_surf = self.font_landing_sub.render(self._t("landing_sub"), True, col)
|
||||
self.screen.blit(sub_surf, sub_surf.get_rect(center=(self.win_w // 2, int(self.win_h * 0.36))))
|
||||
self._draw_level_button(self.level_left_rect, self._t("level_left"))
|
||||
self._draw_level_button(self.level_right_rect, self._t("level_right"))
|
||||
hint_surf = self.font_ui.render(self._t("hint"), True, col)
|
||||
self.screen.blit(hint_surf, hint_surf.get_rect(center=(self.win_w // 2, int(self.win_h * 0.92))))
|
||||
self._draw_home_button()
|
||||
self._draw_lang_button()
|
||||
pygame.display.flip()
|
||||
self.clock.tick(30)
|
||||
|
||||
def _start_detection(self, level: str) -> None:
|
||||
self.level = level
|
||||
self._ensure_camera()
|
||||
if self.transformer is None: self.transformer = StepTransformer()
|
||||
self.mode = "LIVE"
|
||||
self.step = 0
|
||||
self.gate_next_step = None
|
||||
self._invalidate_caches()
|
||||
self.state = "RUNNING"
|
||||
|
||||
def _gate_layout(self) -> None:
|
||||
btn_w_total = int(self.win_w * 0.35)
|
||||
btn_w_total = max(int(320 * self._scale()), min(btn_w_total, int(800 * self._scale())))
|
||||
btn_h = max(int(90 * self._scale()), int(self.win_h * 0.11))
|
||||
|
||||
y = int(self.win_h * 0.80)
|
||||
gap = int(20 * self._scale())
|
||||
|
||||
# Two buttons side by side
|
||||
single_btn_w = (btn_w_total - gap) // 2
|
||||
|
||||
start_x = (self.win_w - btn_w_total) // 2
|
||||
|
||||
self.gate_prev_rect = pygame.Rect(start_x, y, single_btn_w, btn_h)
|
||||
self.gate_next_rect = pygame.Rect(start_x + single_btn_w + gap, y, single_btn_w, btn_h)
|
||||
|
||||
def _enter_gate_for_step(self, target_step: int) -> None:
|
||||
self.gate_next_step = target_step
|
||||
self.state = "GATE"
|
||||
self._invalidate_caches()
|
||||
|
||||
def _accept_gate(self) -> None:
|
||||
if self.gate_next_step is None: self.state = "RUNNING"; return
|
||||
self.mode = "ANALYSE"
|
||||
self.step = self.gate_next_step
|
||||
self.gate_next_step = None
|
||||
self.state = "RUNNING"
|
||||
self._invalidate_caches()
|
||||
|
||||
def _gate_back(self) -> None:
|
||||
# If we are at step 1, go back to LIVE
|
||||
if self.gate_next_step == 1:
|
||||
self.mode = "LIVE"
|
||||
self.step = 0
|
||||
self.state = "RUNNING"
|
||||
self.gate_next_step = None
|
||||
self._invalidate_caches()
|
||||
else:
|
||||
# Otherwise go back to the previous analysis step (without a gate)
|
||||
target = (self.gate_next_step or 2) - 1
|
||||
self.mode = "ANALYSE"
|
||||
self.step = target
|
||||
self.state = "RUNNING"
|
||||
self.gate_next_step = None
|
||||
self._invalidate_caches()
|
||||
|
||||
def _load_gate_animation_frames(self, step_n: int) -> None:
|
||||
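# Each gate step maps to a folder of numbered JPEG frames: assets/<folder>/<prefix><i>.jpg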
sequences = {
|
||||
1: dict(folder="schritt_1_experte", prefix="schritt1experte", start=1, end=62),
|
||||
2: dict(folder="schritt_2_experte", prefix="schritt2experte", start=1, end=101),
|
||||
3: dict(folder="schritt_3_experte", prefix="schritt3experte", start=1, end=48),
|
||||
4: dict(folder="schritt_4_experte", prefix="schritt4experte", start=1, end=78),
|
||||
5: dict(folder="schritt_5_experte", prefix="schritt5experte", start=1, end=29),
|
||||
6: dict(folder="schritt_6_experte", prefix="schritt6experte", start=1, end=68),
|
||||
7: dict(folder="schritt_7_experte", prefix="schritt7experte", start=1, end=80),
|
||||
}
|
||||
seq = sequences.get(step_n)
|
||||
if seq is None:
|
||||
self.gate_anim_frames = None; self.gate_anim_key = None; return
|
||||
folder = os.path.join(self.base_dir, "assets", seq["folder"])
|
||||
prefix = seq["prefix"]; start = seq["start"]; end = seq["end"]
|
||||
key = f"{folder}:{prefix}:{start}:{end}"
|
||||
if self.gate_anim_key == key and self.gate_anim_frames is not None: return
|
||||
frames = []
|
||||
for i in range(start, end + 1):
|
||||
path = os.path.join(folder, f"{prefix}{i}.jpg")
|
||||
try:
|
||||
frames.append(pygame.image.load(path).convert())
|
||||
except Exception as e:
|
||||
if i == start:
|
||||
print(f"DEBUG: Konnte {path} nicht laden. Grund: {e}")
|
||||
|
||||
self.gate_anim_frames = frames if frames else None
|
||||
self.gate_anim_key = key
|
||||
self.gate_anim_idx = 0
|
||||
self.gate_anim_last_ms = pygame.time.get_ticks()
|
||||
|
||||
def _draw_gate(self) -> None:
|
||||
self._recompute_responsive()
|
||||
bg = self._get_landing_bg_scaled()
|
||||
if bg is not None: self.screen.blit(bg, (0, 0))
|
||||
else: self.screen.fill((0, 0, 0))
|
||||
overlay = pygame.Surface((self.win_w, self.win_h), pygame.SRCALPHA)
|
||||
overlay.fill((0, 0, 0, 85))
|
||||
self.screen.blit(overlay, (0, 0))
|
||||
self._gate_layout()
|
||||
|
||||
assert self.snapshot is not None
|
||||
gate_map = self._gate_map()
|
||||
step_n = self.gate_next_step or 1
|
||||
info = gate_map.get(step_n)
|
||||
if info is None: info = StepInfo(title=f"Zwischenschritt {step_n}", body="")
|
||||
|
||||
outer = pygame.Rect(int(self.win_w * 0.06), int(self.win_h * 0.10), int(self.win_w * 0.88), int(self.win_h * 0.62))
|
||||
gap = max(12, int(18 * self._scale()))
|
||||
text_w = int(outer.w * (2 / 3)) - gap // 2
|
||||
anim_w = outer.w - text_w - gap
|
||||
text_rect = pygame.Rect(outer.x, outer.y, text_w, outer.h)
|
||||
anim_rect = pygame.Rect(text_rect.right + gap, outer.y, anim_w, outer.h)
|
||||
target_w, target_h = 720, 405
|
||||
anim_frame_rect = pygame.Rect(0, 0, target_w, target_h)
|
||||
anim_frame_rect.center = anim_rect.center
|
||||
if anim_frame_rect.w > anim_rect.w or anim_frame_rect.h > anim_rect.h:
|
||||
scale = min(anim_rect.w / target_w, anim_rect.h / target_h)
|
||||
anim_frame_rect.size = (max(1, int(target_w * scale)), max(1, int(target_h * scale)))
|
||||
anim_frame_rect.center = anim_rect.center
|
||||
|
||||
if self.gate_cached_step != step_n:
|
||||
title, body = self.text_layout.split_title_body(info.title, info.body)
|
||||
self.gate_cached_title_font, self.gate_cached_body_font = self.text_layout.fit_title_and_body(
|
||||
title, body, text_rect, min_body=self._fsize(18), max_body=self._fsize(34),
|
||||
title_ratio=1.25, line_spacing=max(2, int(4 * self._scale())),
|
||||
)
|
||||
self.gate_cached_title_lines = self.text_layout.wrap_lines(title, self.gate_cached_title_font, text_rect.w)
|
||||
self.gate_cached_body_lines = self.text_layout.wrap_lines(body, self.gate_cached_body_font, text_rect.w) if body else []
|
||||
self.gate_cached_step = step_n
|
||||
|
||||
y = text_rect.y
|
||||
y = self.text_layout.draw_wrapped_lines(self.screen, self.gate_cached_title_lines, self.gate_cached_title_font, (235, 235, 235), text_rect.x, y, line_spacing=max(2, int(6 * self._scale())))
|
||||
if self.gate_cached_body_lines:
|
||||
y += self.gate_cached_body_font.get_linesize()
|
||||
self.text_layout.draw_wrapped_lines(self.screen, self.gate_cached_body_lines, self.gate_cached_body_font, (200, 200, 200), text_rect.x, y, line_spacing=max(2, int(6 * self._scale())))
|
||||
|
||||
self._load_gate_animation_frames(step_n)
|
||||
self.renderer.draw_card(self.screen, anim_frame_rect, fill=(10, 10, 10), outline=(60, 60, 60), radius=max(12, int(16 * self._scale())))
|
||||
if self.gate_anim_frames:
|
||||
now = pygame.time.get_ticks()
|
||||
if now - self.gate_anim_last_ms >= self.gate_anim_frame_ms:
|
||||
self.gate_anim_idx = (self.gate_anim_idx + 1) % len(self.gate_anim_frames)
|
||||
self.gate_anim_last_ms = now
|
||||
frame = self.gate_anim_frames[self.gate_anim_idx]
|
||||
new_size = (max(1, int(frame.get_width() * min(anim_frame_rect.w / frame.get_width(), anim_frame_rect.h / frame.get_height()))),
|
||||
max(1, int(frame.get_height() * min(anim_frame_rect.w / frame.get_width(), anim_frame_rect.h / frame.get_height()))))
|
||||
frame_s = pygame.transform.smoothscale(frame, new_size)
|
||||
self.screen.blit(frame_s, frame_s.get_rect(center=anim_frame_rect.center).topleft)
|
||||
|
||||
# Draw Buttons
|
||||
self.renderer.draw_button(self.screen, self.gate_prev_rect, self._t("back"), self.font_landing_btn, primary=False)
|
||||
self.renderer.draw_button(self.screen, self.gate_next_rect, self._t("gate_btn"), self.font_landing_btn, primary=True)
|
||||
|
||||
self._draw_home_button()
|
||||
self._draw_lang_button()
|
||||
pygame.display.flip()
|
||||
self.clock.tick(30)
|
||||
|
||||
# ---------------- Events ----------------
|
||||
def handle_events(self) -> None:
|
||||
for event in pygame.event.get():
|
||||
if event.type == pygame.QUIT: self.running = False
|
||||
elif event.type == pygame.KEYDOWN:
|
||||
if event.key in (pygame.K_q, pygame.K_ESCAPE): self.running = False
|
||||
if self.state == "GATE":
|
||||
if event.key == pygame.K_RETURN: self._accept_gate(); return
|
||||
if event.key == pygame.K_BACKSPACE: self._gate_back(); return
|
||||
if self.state == "RUNNING":
|
||||
if event.key == pygame.K_SPACE:
|
||||
if self.mode == "LIVE":
|
||||
if self._is_student(): self._enter_gate_for_step(1)
|
||||
else: self.mode, self.step = "ANALYSE", 1; self._invalidate_caches()
|
||||
else: self.mode, self.step = "LIVE", 0; self._invalidate_caches()
|
||||
elif self.mode == "ANALYSE":
|
||||
if event.key == pygame.K_RETURN:
|
||||
nxt = min(self._total_steps(), self.step + 1)
|
||||
if nxt != self.step:
|
||||
if self._is_student(): self._enter_gate_for_step(nxt)
|
||||
else: self.step = nxt; self._invalidate_caches()
|
||||
elif event.key == pygame.K_BACKSPACE:
|
||||
if self._is_student():
|
||||
# CHANGE: in student mode, BACKSPACE now returns to the gate of the CURRENT step
|
||||
self._enter_gate_for_step(self.step)
|
||||
else:
|
||||
prv = max(1, self.step - 1)
|
||||
if prv != self.step:
|
||||
self.step = prv; self._invalidate_caches()
|
||||
elif event.type == pygame.MOUSEBUTTONDOWN and event.button == 1:
|
||||
if self.state == "LIVEINTRO":
|
||||
if self.cta_button_rect.collidepoint(event.pos): self.state = "LANDING"; return
|
||||
if self.state in ("LANDING", "RUNNING", "GATE"):
|
||||
if self.home_button_rect.collidepoint(event.pos): self._go_home(); return
|
||||
if self.lang_button_rect.collidepoint(event.pos): self._toggle_lang(); return
|
||||
if self.state == "LANDING":
|
||||
if self.level_left_rect.collidepoint(event.pos): self._start_detection("SCHUELER"); return
|
||||
if self.level_right_rect.collidepoint(event.pos): self._start_detection("STUDENT"); return
|
||||
if self.state == "GATE":
|
||||
if self.gate_next_rect.collidepoint(event.pos): self._accept_gate(); return
|
||||
if self.gate_prev_rect.collidepoint(event.pos): self._gate_back(); return
|
||||
if self.state == "RUNNING":
|
||||
# Audio Button Check
|
||||
if self.nav_audio_rect.collidepoint(event.pos):
|
||||
self._toggle_audio()
|
||||
return
|
||||
|
||||
if self.mode == "LIVE":
|
||||
if self.nav_action_rect.collidepoint(event.pos):
|
||||
if self._is_student(): self._enter_gate_for_step(1)
|
||||
else: self.mode, self.step = "ANALYSE", 1; self._invalidate_caches()
|
||||
elif self.mode == "ANALYSE":
|
||||
if self.nav_next_rect.collidepoint(event.pos):
|
||||
if self.step == self._total_steps():
|
||||
self.mode, self.step = "LIVE", 0; self._invalidate_caches()
|
||||
else:
|
||||
nxt = min(self._total_steps(), self.step + 1)
|
||||
if nxt != self.step:
|
||||
if self._is_student(): self._enter_gate_for_step(nxt)
|
||||
else: self.step = nxt; self._invalidate_caches()
|
||||
if self.nav_prev_rect.collidepoint(event.pos):
|
||||
if self._is_student():
|
||||
# CHANGE: in student mode, the back button now returns to the gate of the CURRENT step
|
||||
self._enter_gate_for_step(self.step)
|
||||
else:
|
||||
prv = max(1, self.step - 1)
|
||||
if prv != self.step:
|
||||
self.step = prv; self._invalidate_caches()
|
||||
|
||||
def update(self) -> None:
|
||||
if self.state == "LIVEINTRO":
|
||||
self._ensure_camera()
|
||||
self.snapshot = self.detector.capture_snapshot()
|
||||
elif self.state == "RUNNING" and self.mode == "LIVE":
|
||||
self.snapshot = self.detector.capture_snapshot()
|
||||
|
||||
# ---------------- Left overlays helpers ----------------
|
||||
def _draw_hud_lines(self, video_rect: pygame.Rect, lines: List[str]) -> None:
|
||||
x, y = video_rect.x + self.pad, video_rect.y + self.pad
|
||||
for line in lines:
|
||||
self.renderer.draw_pill(self.screen, self.font_small, line, (x, y), fg=self.theme.TEXT)
|
||||
y += self.font_small.get_linesize() + 6
|
||||
|
||||
def _draw_roi_overlay(self, video_rect: pygame.Rect) -> None:
|
||||
if not self.snapshot or not self.snapshot.debug.get("roi"): return
|
||||
r = self.renderer.rect_in_video_coords(self.snapshot.debug["roi"], self.snapshot.src_size, video_rect)
|
||||
pygame.draw.rect(self.screen, (80, 160, 255), r, width=2)
|
||||
|
||||
def _draw_det_list(self, video_rect: pygame.Rect, dets, color=None, draw_label=False) -> None:
|
||||
if not self.snapshot: return
|
||||
for d in dets:
|
||||
r = self.renderer.rect_in_video_coords(d.box, self.snapshot.src_size, video_rect)
|
||||
col = color if color else self.renderer.conf_color(d.conf)
|
||||
pygame.draw.rect(self.screen, col, r, width=2)
|
||||
if draw_label:
|
||||
label = f"{d.label} {d.conf*100:.0f}%"
|
||||
txt = self.font_ui.render(label, True, self.theme.TEXT)
|
||||
pill = pygame.Surface((txt.get_width()+16, txt.get_height()+10), pygame.SRCALPHA)
|
||||
pill.fill((15, 16, 20, 180))
|
||||
self.screen.blit(pill, (r.x, max(video_rect.y+6, r.y-txt.get_height()-16)))
|
||||
self.screen.blit(txt, (r.x+8, max(video_rect.y+10, r.y-txt.get_height()-12)))
|
||||
|
||||
def _draw_pixel_inspector(self, video_rect: pygame.Rect) -> None:
|
||||
if self.step != 1: return
|
||||
|
||||
mx, my = pygame.mouse.get_pos()
|
||||
if not video_rect.collidepoint(mx, my): return
|
||||
|
||||
# Read the colour directly from the screen (the exact pixel under the mouse)
|
||||
try:
|
||||
col = self.screen.get_at((mx, my))
|
||||
except Exception:
|
||||
return
|
||||
|
||||
r, g, b, _ = col
|
||||
|
||||
# Tooltip Text
|
||||
text = f"R:{r} G:{g} B:{b}"
|
||||
font = self.font_ui
|
||||
txt_surf = font.render(text, True, (255, 255, 255))  # white text
|
||||
|
||||
# Position the tooltip slightly offset so the cursor does not cover it
|
||||
tip_x = mx + 20
|
||||
tip_y = my + 20
|
||||
|
||||
# Near the screen edge? Shift left/up
|
||||
if tip_x + txt_surf.get_width() > self.win_w:
|
||||
tip_x = mx - txt_surf.get_width() - 15
|
||||
if tip_y + txt_surf.get_height() > self.win_h:
|
||||
tip_y = my - txt_surf.get_height() - 15
|
||||
|
||||
bg_rect = pygame.Rect(tip_x - 6, tip_y - 6, txt_surf.get_width() + 12, txt_surf.get_height() + 12)
|
||||
|
||||
# Zeichnen
|
||||
pygame.draw.rect(self.screen, (20, 20, 20), bg_rect, border_radius=6)  # dark background
|
||||
pygame.draw.rect(self.screen, (r, g, b), bg_rect, width=2, border_radius=6)  # border in the pixel's colour
|
||||
self.screen.blit(txt_surf, (tip_x, tip_y))
|
||||
|
||||
# ---------------- Draw Main ----------------
|
||||
def draw(self) -> None:
|
||||
if self.state == "LIVEINTRO": self._draw_liveintro(); return
|
||||
if self.state == "LANDING": self._draw_landing(); return
|
||||
if self.state == "GATE": self._draw_gate(); return
|
||||
|
||||
self._recompute_responsive()
|
||||
assert self.snapshot is not None
|
||||
|
||||
video_rect, title_rect, nav_rect, panel_rect = self.make_layout()
|
||||
|
||||
self.screen.fill(self.theme.BG)
|
||||
|
||||
# 1. Video
|
||||
self.renderer.draw_card(self.screen, video_rect, fill=(12, 13, 16), outline=self.theme.LINE, radius=self.theme.RADIUS)
|
||||
self._draw_left_view(video_rect)
|
||||
self._draw_pixel_inspector(video_rect)
|
||||
|
||||
# 2. Title
|
||||
self._draw_title_area(title_rect)
|
||||
|
||||
# 3. Nav Buttons
|
||||
self._draw_navigation_area(nav_rect)
|
||||
|
||||
# 4. Optional Panel (Scores)
|
||||
if panel_rect:
|
||||
self._draw_right_panel_content(panel_rect)
|
||||
|
||||
# 5. Global Elements
|
||||
self._draw_step_indicator_global()
|
||||
self._draw_home_button()
|
||||
self._draw_lang_button()
|
||||
|
||||
pygame.display.flip()
|
||||
self.clock.tick(30)
|
||||
|
||||
def _draw_left_view(self, video_rect: pygame.Rect) -> None:
|
||||
if self.snapshot.frame_rgb is None: return
|
||||
if self.mode == "LIVE":
|
||||
surf = surfarray.make_surface(self.snapshot.frame_rgb.swapaxes(0, 1))
|
||||
self.screen.blit(pygame.transform.smoothscale(surf, (video_rect.w, video_rect.h)), video_rect.topleft)
|
||||
for d in self.snapshot.dets:
|
||||
r = self.renderer.rect_in_video_coords(d.box, self.snapshot.src_size, video_rect)
|
||||
pygame.draw.rect(self.screen, self.renderer.conf_color(d.conf), r, width=2)
|
||||
self._draw_hud_lines(video_rect, [f"Stream: {self.snapshot.debug.get('src_size', (0,0))}"])
|
||||
else:
|
||||
if self.sim_step_cached != self.step:
|
||||
sim_rgb = self.transformer.apply(self.snapshot.frame_rgb, self.step, self.level)
|
||||
self.sim_surface = surfarray.make_surface(sim_rgb.swapaxes(0, 1))
|
||||
self.sim_step_cached = self.step
|
||||
if self.sim_surface:
|
||||
self.screen.blit(pygame.transform.smoothscale(self.sim_surface, (video_rect.w, video_rect.h)), video_rect.topleft)
|
||||
|
||||
dbg = self.snapshot.debug
|
||||
if self.level == "SCHUELER":
|
||||
if self.step == 1:
|
||||
self.renderer.draw_pixel_grid(self.screen, video_rect, spacing=max(18, int(26*self._scale())))
|
||||
self.renderer.draw_pill(self.screen, self.font_ui, "RGB Pixel", (video_rect.x+self.pad, video_rect.y+self.pad), fg=self.theme.TEXT)
|
||||
if self.step == 3:
|
||||
self._draw_det_list(video_rect, self.snapshot.raw_dets, color=(120, 120, 120))
|
||||
self._draw_det_list(video_rect, self.snapshot.dets[:10])
|
||||
if self.step == 4 and self.snapshot.top_dets: self._draw_det_list(video_rect, self.snapshot.top_dets, draw_label=True)
|
||||
else:
|
||||
if self.step == 1: self._draw_hud_lines(video_rect, ["Step1 Capture"])
|
||||
elif self.step == 2:
|
||||
self._draw_roi_overlay(video_rect)
|
||||
self._draw_hud_lines(video_rect, ["Step2 ROI/Aspect", f"ROI: {dbg.get('roi')}"])
|
||||
elif self.step == 3: self._draw_hud_lines(video_rect, ["Step3 Inference (IMX500)"])
|
||||
elif self.step == 4:
|
||||
lines = ["Step4 Read tensors"]
|
||||
if dbg.get("output_shapes"): lines.extend([f"out{i}: {s}" for i, s in enumerate(dbg["output_shapes"][:2])])
|
||||
self._draw_hud_lines(video_rect, lines)
|
||||
elif self.step == 5:
|
||||
self._draw_det_list(video_rect, self.snapshot.raw_dets, color=(255, 220, 80))
|
||||
self._draw_hud_lines(video_rect, ["Step5 Parse candidates"])
|
||||
elif self.step == 6:
|
||||
self._draw_det_list(video_rect, self.snapshot.raw_dets, color=(120, 120, 120))
|
||||
self._draw_det_list(video_rect, self.snapshot.dets[:10])
|
||||
self._draw_hud_lines(video_rect, ["Step6 Filter/Rank"])
|
||||
elif self.step == 7:
|
||||
self._draw_det_list(video_rect, self.snapshot.top_dets, draw_label=True)
|
||||
self._draw_hud_lines(video_rect, ["Step7 Render"])
|
||||
|
||||
# ---------------- Main Loop ----------------
|
||||
def run(self) -> None:
|
||||
try:
|
||||
while self.running:
|
||||
self.handle_events()
|
||||
self.update()
|
||||
self.draw()
|
||||
finally:
|
||||
pygame.quit()
|
||||
if self.detector: self.detector.stop()
|
||||
|
||||
|
||||
def get_args() -> argparse.Namespace:
|
||||
p = argparse.ArgumentParser()
|
||||
p.add_argument("--model", required=True)
|
||||
p.add_argument("--threshold", type=float, default=0.55)
|
||||
p.add_argument("--iou", type=float, default=0.65)
|
||||
p.add_argument("--max-detections", type=int, default=10)
|
||||
p.add_argument("--bbox-normalization", action=argparse.BooleanOptionalAction, default=None)
|
||||
p.add_argument("--bbox-order", choices=["yx", "xy"], default="yx")
|
||||
p.add_argument("--postprocess", choices=["", "nanodet"], default=None)
|
||||
p.add_argument("-r", "--preserve-aspect-ratio", action=argparse.BooleanOptionalAction, default=None)
|
||||
p.add_argument("--labels", type=str, default=None)
|
||||
p.add_argument("--cam-width", type=int, default=1280)
|
||||
p.add_argument("--cam-height", type=int, default=720)
|
||||
return p.parse_args()
|
||||
|
||||
|
||||
def main() -> None:
|
||||
args = get_args()
|
||||
App(args).run()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
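For reference, a hypothetical invocation of the GUI could look like the sketch below. The `.rpk` path assumes the Raspberry Pi `imx500-models` package is installed; the labels file name and the flag values are placeholders and depend on your setup.

```python
# Hypothetical launch of app.py (illustration only; adjust paths and flags to your setup).
import shlex
import subprocess

cmd = (
    "python3 app.py "
    "--model /usr/share/imx500-models/imx500_network_ssd_mobilenetv2_fpnlite_320x320_pp.rpk "
    "--labels coco_labels.txt "
    "--threshold 0.55 --iou 0.65 --max-detections 10 "
    "--cam-width 1280 --cam-height 720"
)
subprocess.run(shlex.split(cmd), check=True)
```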
BIN
audio/schueler_step_1.mp3
Normal file
BIN
audio/schueler_step_1_english.mp3
Normal file
BIN
audio/schueler_step_2.mp3
Normal file
BIN
audio/schueler_step_2_english.mp3
Normal file
BIN
audio/schueler_step_3.mp3
Normal file
BIN
audio/schueler_step_3_english.mp3
Normal file
BIN
audio/schueler_step_4.mp3
Normal file
BIN
audio/schueler_step_4_english.mp3
Normal file
299
detector.py
Normal file
@ -0,0 +1,299 @@
#!/usr/bin/env python3
# imx500_gui/detector.py

from __future__ import annotations

from dataclasses import dataclass
from functools import lru_cache
from typing import List, Tuple, Optional, Dict, Any
import sys

import numpy as np

from picamera2 import Picamera2
from picamera2.devices.imx500 import IMX500, NetworkIntrinsics, postprocess_nanodet_detection  # type: ignore


RAW_TOPK = 20  # raw candidates to show in Student Step 5/6


@dataclass(frozen=True)
class Det:
    label: str
    conf: float
    box: Tuple[int, int, int, int]  # (x, y, w, h) in stream coords


@dataclass
class FrameSnapshot:
    frame_rgb: Optional[np.ndarray] = None
    src_size: Tuple[int, int] = (1280, 720)  # (w, h)

    dets: Optional[List[Det]] = None      # filtered detections (>= threshold), sorted desc
    raw_dets: Optional[List[Det]] = None  # raw Top-K before threshold, sorted desc

    top3: Optional[List[Tuple[str, float]]] = None
    top_dets: Optional[List[Det]] = None

    debug: Optional[Dict[str, Any]] = None

    def __post_init__(self) -> None:
        # Replace missing collections with empty ones so callers never see None.
        self.dets = self.dets or []
        self.raw_dets = self.raw_dets or []
        self.top3 = self.top3 or []
        self.top_dets = self.top_dets or []
        self.debug = self.debug or {}


class IMX500Detector:
    def __init__(self, args):
        self.args = args
        self.imx500 = IMX500(args.model)
        self.intrinsics: NetworkIntrinsics = self.imx500.network_intrinsics

        self._init_intrinsics()
        self.picam2 = self._init_camera()

    def _init_intrinsics(self) -> None:
        intr = self.intrinsics
        if not intr:
            intr = NetworkIntrinsics()
            intr.task = "object detection"
            self.intrinsics = intr
        elif intr.task != "object detection":
            print("Network is not an object detection task.", file=sys.stderr)
            sys.exit(1)

        if getattr(self.args, "labels", None):
            with open(self.args.labels, "r", encoding="utf-8") as f:
                intr.labels = f.read().splitlines()

        if getattr(self.args, "bbox_normalization", None) is not None:
            intr.bbox_normalization = self.args.bbox_normalization
        if getattr(self.args, "bbox_order", None) is not None:
            intr.bbox_order = self.args.bbox_order
        if getattr(self.args, "preserve_aspect_ratio", None) is not None:
            intr.preserve_aspect_ratio = self.args.preserve_aspect_ratio

        if getattr(self.args, "postprocess", None) is not None:
            intr.postprocess = self.args.postprocess

        if intr.labels is None:
            intr.labels = [f"Class {i}" for i in range(1000)]

        intr.update_with_defaults()

    def _init_camera(self) -> Picamera2:
        picam2 = Picamera2(self.imx500.camera_num)

        config = picam2.create_preview_configuration(
            main={"size": (self.args.cam_width, self.args.cam_height), "format": "RGB888"},
            controls={"FrameRate": self.intrinsics.inference_rate},
            buffer_count=6,
        )

        self.imx500.show_network_fw_progress_bar()
        picam2.configure(config)
        picam2.start()

        if self.intrinsics.preserve_aspect_ratio:
            self.imx500.set_auto_aspect_ratio()

        return picam2

    @lru_cache(maxsize=1)
    def get_labels(self) -> List[str]:
        labels = self.intrinsics.labels or []
        if getattr(self.intrinsics, "ignore_dash_labels", False):
            labels = [label for label in labels if label and label != "-"]
        return labels

    def _apply_bbox_normalization_and_order(
        self, boxes: np.ndarray, input_w: int, input_h: int
    ) -> np.ndarray:
        bbox_norm = bool(self.intrinsics.bbox_normalization) if self.intrinsics.bbox_normalization is not None else False
        bbox_order = self.intrinsics.bbox_order or "yx"

        out = np.array(boxes, dtype=np.float32, copy=True)

        if bbox_norm:
            if bbox_order == "yx":
                out[:, 0] *= input_h
                out[:, 1] *= input_w
                out[:, 2] *= input_h
                out[:, 3] *= input_w
            else:
                out[:, 0] *= input_w
                out[:, 1] *= input_h
                out[:, 2] *= input_w
                out[:, 3] *= input_h

        return out

    def _nanodet_xywh_center_to_xyxy(self, boxes_xywh_c: np.ndarray) -> np.ndarray:
        # Convert center-format boxes (x_c, y_c, w, h) to corner format (x0, y0, x1, y1).
        x_c = boxes_xywh_c[:, 0]
        y_c = boxes_xywh_c[:, 1]
        w = boxes_xywh_c[:, 2]
        h = boxes_xywh_c[:, 3]
        x0 = x_c - w / 2.0
        y0 = y_c - h / 2.0
        x1 = x_c + w / 2.0
        y1 = y_c + h / 2.0
        return np.stack([x0, y0, x1, y1], axis=1).astype(np.float32)

    def _map_to_int_xywh(self, mapped: Tuple[float, float, float, float]) -> Tuple[int, int, int, int]:
        x, y, w, h = mapped
        return int(x), int(y), int(w), int(h)

    def _safe_get_roi_and_scalercrop(
        self, metadata: Dict[str, Any]
    ) -> Tuple[Optional[Tuple[int, int, int, int]], Optional[Tuple[int, int, int, int]]]:
        roi = None
        sc = None

        try:
            b = self.imx500.get_roi_scaled(metadata)
            if isinstance(b, tuple) and len(b) == 4:
                roi = tuple(int(v) for v in b)
            else:
                roi = (int(b.x), int(b.y), int(b.width), int(b.height))
        except Exception:
            roi = None

        for k in ("ScalerCrop", "scaler_crop", "scalerCrop"):
            if k in metadata:
                v = metadata.get(k)
                try:
                    if isinstance(v, (list, tuple)) and len(v) == 4:
                        sc = tuple(int(x) for x in v)
                except Exception:
                    pass
                break

        return roi, sc

    def parse_detections(
        self, metadata: Dict[str, Any]
    ) -> Tuple[List[Det], List[Det], List[Tuple[str, float]], Dict[str, Any]]:
        threshold = float(self.args.threshold)
        iou = float(self.args.iou)
        max_detections = int(self.args.max_detections)

        np_outputs = self.imx500.get_outputs(metadata, add_batch=True)
        input_w, input_h = self.imx500.get_input_size()

        debug: Dict[str, Any] = {
            "threshold": threshold,
            "iou": iou,
            "max_detections": max_detections,
            "raw_topk": RAW_TOPK,
            "input_size": (input_w, input_h),
            "bbox_order": self.intrinsics.bbox_order,
            "bbox_normalization": self.intrinsics.bbox_normalization,
            "preserve_aspect_ratio": self.intrinsics.preserve_aspect_ratio,
            "postprocess": self.intrinsics.postprocess,
            "network_name": getattr(self.intrinsics, "network_name", None),
        }

        try:
            shapes = self.imx500.get_output_shapes(metadata)
            debug["output_shapes"] = [tuple(int(x) for x in s) for s in shapes] if shapes else None
        except Exception:
            debug["output_shapes"] = None

        roi, sc = self._safe_get_roi_and_scalercrop(metadata)
        debug["roi"] = roi
        debug["scaler_crop"] = sc

        if np_outputs is None:
            debug["raw_candidates"] = 0
            debug["kept"] = 0
            return [], [], [], debug

        labels = self.get_labels()

        raw_dets: List[Det] = []
        kept_dets: List[Det] = []

        if self.intrinsics.postprocess == "nanodet":
            boxes, scores, classes = postprocess_nanodet_detection(
                outputs=np_outputs[0],
                conf=0.0,
                iou_thres=iou,
                max_out_dets=max(max_detections, RAW_TOPK),
            )

            boxes = np.asarray(boxes)
            scores = np.asarray(scores)
            classes = np.asarray(classes)
            debug["raw_candidates"] = int(len(scores))

            boxes_xyxy = self._nanodet_xywh_center_to_xyxy(np.asarray(boxes, dtype=np.float32))

            for box_xyxy, score, category in zip(boxes_xyxy, scores, classes):
                conf = float(score)
                cat = int(category)
                name = labels[cat] if 0 <= cat < len(labels) else f"Class {cat}"

                mapped = self.imx500.convert_inference_coords(tuple(box_xyxy), metadata, self.picam2)
                det = Det(label=name, conf=conf, box=self._map_to_int_xywh(mapped))
                raw_dets.append(det)
                if conf >= threshold:
                    kept_dets.append(det)

        else:
            boxes = np.asarray(np_outputs[0][0], dtype=np.float32)
            scores = np.asarray(np_outputs[1][0], dtype=np.float32)
            classes = np.asarray(np_outputs[2][0], dtype=np.float32)
            debug["raw_candidates"] = int(len(scores))

            boxes = self._apply_bbox_normalization_and_order(boxes, input_w=input_w, input_h=input_h)

            for box, score, category in zip(boxes, scores, classes):
                conf = float(score)
                cat = int(category)
                name = labels[cat] if 0 <= cat < len(labels) else f"Class {cat}"

                mapped = self.imx500.convert_inference_coords(tuple(box), metadata, self.picam2)
                det = Det(label=name, conf=conf, box=self._map_to_int_xywh(mapped))
                raw_dets.append(det)
                if conf >= threshold:
                    kept_dets.append(det)

        raw_sorted = sorted(raw_dets, key=lambda d: d.conf, reverse=True)
        kept_sorted = sorted(kept_dets, key=lambda d: d.conf, reverse=True)

        raw_topk = raw_sorted[:RAW_TOPK]
        top3 = [(d.label, d.conf) for d in kept_sorted[:3]]

        debug["kept"] = int(len(kept_sorted))

        return kept_sorted, raw_topk, top3, debug

    def capture_snapshot(self) -> FrameSnapshot:
        request = self.picam2.capture_request()
        try:
            metadata = request.get_metadata()
            frame = request.make_array("main")
            # The "RGB888" array arrives channel-reversed from Picamera2; flip to RGB for display.
            frame = frame[..., ::-1].copy()

            src_h, src_w = frame.shape[:2]

            dets, raw_topk, top3, debug = self.parse_detections(metadata)
            debug["src_size"] = (src_w, src_h)

            return FrameSnapshot(
                frame_rgb=frame,
                src_size=(src_w, src_h),
                dets=dets,
                raw_dets=raw_topk,
                top3=top3,
                top_dets=dets[:3],
                debug=debug,
            )
        finally:
            request.release()

    def stop(self) -> None:
        self.picam2.stop()
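The two parsing helpers are plain array operations: center-format boxes are turned into corner format, and candidates below the confidence threshold are dropped while the raw list is kept for the didactic views. A standalone sketch of the same idea on toy data (illustration only, not tied to the camera or the IMX500 firmware):

```python
# Illustration: center-format boxes to corner format, then confidence filtering,
# mirroring what parse_detections does with the decoded IMX500 outputs.
import numpy as np

boxes_xywh_c = np.array([[150.0, 100.0, 60.0, 40.0],      # (x_center, y_center, w, h)
                         [400.0, 300.0, 120.0, 200.0]], dtype=np.float32)
scores = np.array([0.91, 0.42], dtype=np.float32)
threshold = 0.55

x0 = boxes_xywh_c[:, 0] - boxes_xywh_c[:, 2] / 2.0
y0 = boxes_xywh_c[:, 1] - boxes_xywh_c[:, 3] / 2.0
x1 = boxes_xywh_c[:, 0] + boxes_xywh_c[:, 2] / 2.0
y1 = boxes_xywh_c[:, 1] + boxes_xywh_c[:, 3] / 2.0
boxes_xyxy = np.stack([x0, y0, x1, y1], axis=1)

keep = scores >= threshold
print(boxes_xyxy[keep])   # only the 0.91 candidate survives the threshold
```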
BIN
schritt_1_experte/schritt1experte1.jpg … schritt_1_experte/schritt1experte62.jpg
Normal file (62 JPEG frames, 120–185 KiB each)
444
steps.py
Normal file
@ -0,0 +1,444 @@
# -*- coding: utf-8 -*-
# imx500_gui/steps.py

from dataclasses import dataclass
from typing import Dict
import numpy as np


@dataclass(frozen=True)
class StepInfo:
    title: str
    body: str


# -------------------------
# SCHUELER (DE) - 4 Steps
# -------------------------
STEP_TEXT_SCHUELER_DE: Dict[int, StepInfo] = {
    1: StepInfo(
        title="1. Input & Vorverarbeitung: Das digitale Auge",
        body=(
            "Bevor die KI loslegt, muss das Bild „vorbereitet“ werden. Die Kamera liefert Millionen bunter Pixel, "
            "doch das KI-Gehirn hat einen Tunnelblick. Es schneidet unwichtige Bereiche weg und verkleinert das Bild "
            "(Aspect-Ratio), damit die Berechnungen blitzschnell in Echtzeit funktionieren. Schönheit spielt hier keine Rolle – "
            "nur Effizienz zählt!\n\n"
            "Was passiert: Zuschneiden, Verkleinern, Farbanpassung.\n"
            "Die Sicht der KI: Ein etwas verzerrtes, pixeliges Quadrat."
        ),
    ),
    2: StepInfo(
        title="2. Feature Extraction: Muster erkennen",
        body=(
            "Jetzt sucht der KI-Chip (wie der IMX500) nach markanten Merkmalen. Anstatt das ganze Bild auf einmal zu verstehen, "
            "zerlegt die KI es in Bausteine. Sie sucht nach Kanten, Rundungen oder Farbübergängen.\n\n"
            "Der Prozess: Mathematische Filter gleiten über das Bild.\n"
            "Das Ziel: Aus einfachen Linien werden komplexe Formen (z. B. „zwei Kreise über einer geraden Linie“ könnten die Räder eines Autos sein)."
        ),
    ),
    3: StepInfo(
        title="3. Localization: Wo ist das Objekt?",
        body=(
            "Sobald die KI interessante Muster gefunden hat, zeichnet sie virtuelle Rahmen, die sogenannten Bounding Boxes. "
            "Da die KI anfangs nur rät, entstehen oft hunderte überlappende Rahmen. Jeder Rahmen wird durch ein mathematisches "
            "Koordinatensystem definiert:\n\n"
            "[x, y, width, height]\n"
            "x, y: Der Startpunkt des Rahmens (meist oben links).\n"
            "width/height: Wie breit und hoch die Box ist."
        ),
    ),
    4: StepInfo(
        title="4. Classification & Confidence: Was ist es und wie sicher sind wir?",
        body=(
            "Im letzten Schritt wird aufgeräumt. Die KI vergleicht die gefundenen Muster in den Boxen mit ihrem Training "
            "und gibt dem Objekt einen Namen (Label) und eine Sicherheit (Confidence).\n\n"
            "Classification: Jedem Rahmen wird ein Label (Name) zugeordnet.\n"
            "Confidence Scoring: Die KI gibt einen Prozentwert an (z. B. 0.95 für 95%). Alles, was unter einem Schwellenwert liegt "
            "(z. B. unter 50%), wird gelöscht. Übrig bleibt nur das sauber markierte Endergebnis.\n\n"
            "Merksatz: Die KI „sieht“ keine Bilder, sie berechnet Wahrscheinlichkeiten in Rahmen!"
        ),
    ),
}


# -------------------------
# STUDIS (DE) - 7 Steps
# -------------------------
STEP_TEXT_STUDIS_DE: Dict[int, StepInfo] = {
    1: StepInfo(
        title="1. Data Acquisition & RGB-Input",
        body=(
            "Der Prozess beginnt mit der Rohdatenerfassung. Der IMX500-Sensor liefert einen Videostream im RGB-Farbraum. "
            "Was für uns wie ein semantisches Bild aussieht, ist für den Algorithmus zunächst nur eine unstrukturierte Matrix "
            "aus Pixelwerten (Intensitäten von 0 bis 255 pro Farbkanal).\n\n"
            "Didaktischer Hinweis: KI beginnt nicht mit ‚Verstehen‘, sondern mit statistischer Signalverarbeitung. "
            "Die Qualität dieses Inputs (Belichtung, Rauschen) bestimmt maßgeblich die spätere Erkennungsleistung "
            "(‚Garbage In, Garbage Out‘-Prinzip)."
        ),
    ),
    2: StepInfo(
        title="2. Input Normalization & Resizing",
        body=(
            "Neuronale Netze haben eine fixe Eingangsgröße (Input Layer), hier z. B. 300x300 Pixel. "
            "Das hochauflösende Bild muss daher herunterskaliert und normalisiert werden. Dabei entstehen zwangsläufig "
            "Informationsverluste und Verzerrungen (Aliasing). Der Algorithmus sieht also nie die Realität, sondern nur "
            "eine stark komprimierte Abstraktion davon.\n\n"
            "Dies ist ein kritischer Flaschenhals: Kleine Objekte verschwinden hier oft bereits, bevor die eigentliche Analyse beginnt."
        ),
    ),
    3: StepInfo(
        title="3. Convolutional Neural Network (CNN)",
        body=(
            "Nun erfolgt die Inferenz (Schlussfolgerung) auf dem AI-Accelerator. Ein Convolutional Neural Network (CNN) extrahiert "
            "hierarchische Merkmale. In den ersten Schichten (Layers) erkennen Filter einfache Geometrien wie Kanten und Ecken. "
            "In tieferen Schichten werden diese zu komplexen Mustern (z. B. ‚Auge‘, ‚Rad‘) kombiniert.\n\n"
            "Dies ist keine Magie, sondern reine Matrixmultiplikation: Der Input wird durch gewichtete Filtermatrizen transformiert, "
            "um relevante Features hervorzuheben."
        ),
    ),
    4: StepInfo(
        title="4. Raw Output Tensor",
        body=(
            "Das Ergebnis der Inferenz ist kein Bild, sondern ein Tensor (ein mehrdimensionales Array). Dieser Vektor enthält "
            "tausende numerische Werte, die codierte Informationen über Box-Koordinaten, Klassen-Wahrscheinlichkeiten und Objekt-Scores enthalten. "
            "Das Modell ‚weiß‘ zu diesem Zeitpunkt noch nicht, was ein Objekt ist – es liefert lediglich eine Wahrscheinlichkeitsverteilung "
            "über den gesamten Bildraum zurück."
        ),
    ),
    5: StepInfo(
        title="5. Bounding Box Regression",
        body=(
            "Der Output-Tensor wird decodiert (geparst). Das Modell generiert basierend auf gelernten ‚Anchor Boxes‘ hunderte von Hypothesen, "
            "wo sich Objekte befinden könnten. Da das Netz probabilistisch arbeitet, wird für fast jeden Bildbereich eine Vermutung geäußert – "
            "auch für den Hintergrund (Rauschen).\n\n"
            "Wir sehen hier die ‚Unsicherheit‘ der KI: Sie diskriminiert noch nicht zwischen relevantem Objekt und statistischem Rauschen."
        ),
    ),
    6: StepInfo(
        title="6. Non-Maximum Suppression (NMS)",
        body=(
            "Um aus dem Chaos valide Ergebnisse zu filtern, werden zwei Algorithmen angewandt:\n\n"
            "Confidence Thresholding: Alle Hypothesen unter einem Schwellenwert (z. B. 50%) werden verworfen.\n\n"
            "Non-Maximum Suppression (NMS): Überlappen sich mehrere Boxen für dasselbe Objekt (hohe ‚Intersection over Union‘), "
            "wird nur diejenige mit der höchsten Wahrscheinlichkeit behalten.\n\n"
            "Dies ist der Entscheidungsschritt, bei dem die KI ‚auswählt‘."
        ),
    ),
    7: StepInfo(
        title="7. Final Output & Bias Check",
        body=(
            "Die normalisierten Koordinaten werden auf die Originalauflösung zurückgerechnet (Upscaling). Kritische Reflexion: "
            "Das Label (z. B. ‚Person‘) stammt aus dem Trainingsdatensatz (z. B. COCO). Kennt das Modell ein Objekt nicht, wird es das "
            "optisch ähnlichste Label wählen (Fehlklassifikation). KI ist also nie objektiv, sondern immer abhängig von den Daten, "
            "mit denen sie trainiert wurde.\n\n"
            "Interesse an der Technik? Infos zu unseren Informatik-Modulen gibt es am Stand XY."
        ),
    ),
}


# -------------------------
# STUDIS Gate texts (DE)
# -------------------------
GATE_TEXT_STUDIS_DE: Dict[int, StepInfo] = {
    1: StepInfo(
        title="Schritt 1 – Camera Capture (Datenerfassung)",
        body=(
            "Bevor ein System Objekte erkennen kann, muss zuerst ein Bild aufgenommen werden. Die Kamera liefert dabei nicht sofort eine „Szene“, "
            "wie wir Menschen sie wahrnehmen, sondern nur eine große Menge an Zahlen: eine Matrix aus Pixelwerten. In der Animation siehst du, "
            "wie das aufgenommene Bild in Pixel unterteilt wird. Jedes Pixel enthält Intensitäten für Rot, Grün und Blau.\n\n"
            "Für die KI ist dieses Rohbild also kein „Hund“ oder „Mensch“, sondern reine Sensordaten. Deshalb ist dieser Schritt grundlegend: "
            "Alles, was später erkannt werden soll, hängt davon ab, wie gut die Daten am Anfang sind.\n\n"
            "Wenn das Bild zum Beispiel unscharf ist oder stark rauscht, kann das Modell später kaum noch zuverlässige Ergebnisse liefern. "
            "Dieses Prinzip nennt man oft „Garbage In, Garbage Out“: Schlechte Eingaben führen zu schlechten Ausgaben."
        ),
    ),
    2: StepInfo(
        title="Schritt 2 – Pre-Processing (Skalierung)",
        body=(
            "Nach der Aufnahme muss das Bild für das neuronale Netz vorbereitet werden. Der Sensor liefert ein großes Rohbild, das anschließend "
            "skaliert wird. Dies wird über Skalierer festgestellt, danach wird ein Ausschnitt ausgewählt.\n\n"
            "Denn die Modelle arbeiten nicht mit beliebigen Bildgrößen, sondern erwarten einen festen Input, beispielsweise 300 × 300 Pixel. "
            "Deshalb wird das Bild, wie in der Animation veranschaulicht, auf einen bestimmten Bereich skaliert.\n\n"
            "Dabei gehen jedoch oft Informationen verloren. Feine Details oder kleinere Objekte können verschwinden. Dieser Effekt ist in der "
            "Bildverarbeitung bekannt und wird oft als Aliasing bezeichnet.\n\n"
            "Wichtig ist: Das Modell sieht nicht die Realität, sondern nur eine komprimierte Version davon. Der Schritt des Pre-Processing wird "
            "daher als kritisch betrachtet, da alles, was hierbei verloren geht, später nicht mehr zurückgeholt werden kann."
        ),
    ),
    3: StepInfo(
        title="Schritt 3 – Inferenz (Feature Extraction)",
        body=(
            "Jetzt beginnt die eigentliche Analyse. Das neuronale Netz verarbeitet das Bild mit sogenannten Convolutional Neural Networks, den CNNs. "
            "Diese extrahieren Merkmale in mehreren Schichten.\n\n"
            "Hier wird dargestellt, wie das Netz in den ersten Layern einfache Muster wie Kanten oder Kontraste erkennt. In tieferen Schichten "
            "entstehen daraus komplexere Strukturen – etwa Objektteile wie „Rad“ oder „Auge“.\n\n"
            "Dieser Prozess wirkt oft „intelligent“, basiert aber mathematisch auf Faltungen und Matrixmultiplikationen. Faltung bedeutet: "
            "Ein kleines Muster-Prüfwerkzeug wird über das Bild geschoben, um wichtige Bildmerkmale wie Kanten, Formen oder Texturen herauszufiltern. "
            "Das Netz versteht also nicht wie ein Mensch, sondern transformiert Bilddaten statistisch, um relevante Features hervorzuheben.\n\n"
            "Dieser Schritt ist entscheidend, weil hier visuelle Information erstmals in eine interne Repräsentation für die KI übersetzt wird."
        ),
    ),
    4: StepInfo(
        title="Schritt 4 – Tensor Readout (Abstrakter Output)",
        body=(
            "Nach der Inferenz entsteht kein neues Bild, sondern ein Tensor: ein mehrdimensionales Zahlenfeld. Dieser Tensor enthält tausende Werte, "
            "die mögliche Objektpositionen, Klassenwahrscheinlichkeiten und Scores codieren.\n\n"
            "Das Modell „sieht“ also keine Objekte, sondern berechnet Wahrscheinlichkeiten. Es liefert keine symbolische Aussage wie "
            "„Hier ist ein Mensch“, sondern mathematische Hinweise darauf, wo etwas sein könnte.\n\n"
            "Dieser Moment zeigt besonders gut: KI arbeitet nicht mit Bedeutung, sondern mit Statistik. Erst in den nächsten Schritten wird aus diesem "
            "abstrakten Output wieder etwas, das für uns als Ergebnis interpretierbar ist."
        ),
    ),
    5: StepInfo(
        title="Schritt 5 – Decoding & Proposals (Hypothesen)",
        body=(
            "Nun wird der Tensor decodiert. Das Modell erzeugt eine große Anzahl an Vorschlägen, wo sich Objekte befinden könnten. Diese sogenannten "
            "Proposals basieren oft auf Anchor Boxes, die viele Positionen und Größen abdecken, wie du in der Animation sehen kannst.\n\n"
            "Dabei ist das System bewusst großzügig: Lieber entstehen zu viele Hypothesen als zu wenige, damit kein Objekt übersehen wird. Deshalb "
            "sieht man in dieser Phase oft ein „Chaos“ aus Rahmen – viele davon sind falsch oder unsicher.\n\n"
            "Objekterkennung ist also kein deterministischer Prozess, sondern ein probabilistisches Raten: Das Modell stellt Vermutungen auf, bevor es "
            "auswählen kann."
        ),
    ),
    6: StepInfo(
        title="Schritt 6 – NMS & Thresholding (Filterung)",
        body=(
            "Damit aus den vielen Hypothesen ein sinnvolles Ergebnis entsteht, werden die Vorschläge jetzt gefiltert.\n\n"
            "Zuerst werden alle Boxen entfernt, deren Wahrscheinlichkeit unter einem bestimmten Schwellenwert, also einem Confidence Threshold, liegt. "
            "Danach folgt die Non-Maximum Suppression (NMS): Wenn mehrere Boxen dasselbe Objekt markieren und sich stark überlappen, bleibt nur die "
            "wahrscheinlichste übrig. So bleiben in unserem Beispiel die gelben und grünen Boxen erhalten.\n\n"
            "Hier entscheidet das System also, welche Detektion tatsächlich relevant ist. Dieser Schritt ist notwendig, weil moderne Detektionsmodelle "
            "systematisch mehrere Vorschläge für ein einzelnes Objekt erzeugen.\n\n"
            "Erst durch diese Filterung entsteht eine klare, interpretierbare Auswahl."
        ),
    ),
    7: StepInfo(
        title="Schritt 7 – Coordinate Mapping & Reflexion (Ergebnis)",
        body=(
            "Wie du siehst, werden im letzten Schritt die finalen Boxen auf die ursprüngliche Bildgröße zurückgerechnet und sichtbar angezeigt. Erst jetzt entsteht das "
            "Ergebnis, das wir als Objekterkennung wahrnehmen: Bounding Box, Label und Confidence Score.\n\n"
            "Doch dieser Schritt ist auch ein Punkt für kritische Reflexion: Die Labels stammen aus dem Trainingsdatensatz des Modells. Erkennt die KI "
            "ein Objekt nicht, wählt sie oft das ähnlichste bekannte Label – das kann zu Fehlklassifikationen führen.\n\n"
            "Modelle sind also nicht objektiv, sondern abhängig von den Daten, mit denen sie trainiert wurden. Die Visualisierung ist deshalb nicht nur "
            "ein technischer Abschluss, sondern auch eine Einladung, die Grenzen der algorithmischen Wahrnehmung zu hinterfragen."
        ),
    ),
}


# -------------------------
# STUDIS Gate texts (EN)
# -------------------------
GATE_TEXT_STUDIS_EN: Dict[int, StepInfo] = {
    1: StepInfo(
        title="Step 1 – Camera Capture (Data acquisition)",
        body=(
            "Before a system can detect objects, it first has to capture an image. The camera does not deliver a “scene” the way humans perceive it, "
            "but a large set of numbers: a matrix of pixel values. In the animation you can see how the captured image is divided into pixels. "
            "Each pixel contains intensities for red, green, and blue.\n\n"
            "For the AI, this raw image is not a “dog” or a “person” yet—it is purely sensor data. That is why this step is fundamental: everything "
            "the model can detect later depends on the quality of the initial data.\n\n"
            "If the image is blurry or very noisy, the model will hardly be able to produce reliable results afterwards. This is often summarized as "
            "“Garbage In, Garbage Out”: poor inputs lead to poor outputs."
        ),
    ),
    2: StepInfo(
        title="Step 2 – Pre-processing (Normalization)",
        body=(
            "After capturing the image, it must be prepared for the neural network. Models do not accept arbitrary image sizes; they expect a fixed input, "
            "for example 300×300 pixels.\n\n"
            "That is why the image is downscaled and normalized (as shown in the animation). However, this always causes information loss: fine details "
            "or small objects can disappear. In image processing this effect is often described as aliasing.\n\n"
            "The key point is: the model does not see reality—it only sees a compressed version of it. Pre-processing is therefore a critical bottleneck: "
            "whatever is lost here cannot be recovered later."
        ),
    ),
    3: StepInfo(
        title="Step 3 – Inference (Feature extraction)",
        body=(
            "Now the actual analysis begins. The neural network processes the image using Convolutional Neural Networks (CNNs), which extract features "
            "across multiple layers.\n\n"
            "The visualization illustrates how early layers detect simple patterns such as edges or contrasts. Deeper layers combine these into more complex "
            "structures—object parts like a “wheel” or an “eye”.\n\n"
            "This can look “intelligent”, but mathematically it is based on convolutions and matrix multiplications. Convolution means sliding a small "
            "pattern-checking tool over the image to highlight important visual cues like edges, shapes, or textures. The network does not understand like a human; "
            "it transforms image data statistically to emphasize relevant features.\n\n"
            "This step is crucial because it is where visual information is translated into the AI’s internal representation."
        ),
    ),
    4: StepInfo(
        title="Step 4 – Tensor readout (Abstract output)",
        body=(
            "After inference, the result is not a new image, but a tensor: a multi-dimensional field of numbers. This tensor contains thousands of values "
            "encoding potential object locations, class probabilities, and scores.\n\n"
            "The model does not “see” objects—it computes probabilities. It does not output a symbolic statement like “There is a person here”, but rather "
            "mathematical hints about where something might be.\n\n"
            "This step makes it especially clear: AI does not operate on meaning, but on statistics. Only in the next steps does this abstract output become "
            "something we can interpret as a result."
        ),
    ),
    5: StepInfo(
        title="Step 5 – Decoding & proposals (Hypotheses)",
        body=(
            "Next, the tensor is decoded. The model produces a large number of proposals for where objects might be. These proposals often rely on anchor boxes "
            "covering many positions and sizes.\n\n"
            "The system is intentionally generous at this stage: it prefers too many hypotheses over too few, to avoid missing objects. That is why this phase "
            "often looks like “chaos” with many boxes—many of them are wrong or uncertain.\n\n"
            "Object detection is therefore not deterministic; it is probabilistic guessing. The model generates hypotheses before it can select the most plausible ones."
        ),
    ),
    6: StepInfo(
        title="Step 6 – NMS & thresholding (Filtering)",
        body=(
            "To turn many hypotheses into a meaningful result, the proposals are filtered.\n\n"
            "First, all boxes with a probability below a certain threshold (a confidence threshold) are removed. Then Non-Maximum Suppression (NMS) is applied: "
            "if multiple boxes mark the same object and overlap strongly, only the most probable one remains.\n\n"
            "This is where the system decides which detections are actually relevant. The step is necessary because modern detectors systematically create multiple "
            "proposals for a single object.\n\n"
            "Only after this filtering do we get a clear and interpretable selection."
        ),
    ),
    7: StepInfo(
        title="Step 7 – Coordinate mapping & reflection (Result)",
        body=(
            "In the final step, the selected boxes are mapped back to the original image size and displayed. Only now do we see what we perceive as object detection: "
            "bounding box, label, and confidence score.\n\n"
            "But this is also a point for critical reflection: labels come from the model’s training dataset. If the AI has never learned an object, it will often choose "
            "the most similar known label—leading to misclassifications.\n\n"
            "Models are not objective; they depend on the data they were trained on. The visualization is therefore not only a technical conclusion, but also an invitation "
            "to reflect on the limits of algorithmic perception."
        ),
    ),
}


def total_steps_for_level(level: str) -> int:
    lv = (level or "").upper()
    if lv in ("SCHUELER", "SCHÜLER", "PUPIL", "SCHOOL", "SCHOOLER"):
        return 4
    if lv in ("STUDENT", "STUDIS", "STUDI", "STUDENTS"):
        return 7
    return 7


def build_step_text(level: str, lang: str = "DE", debug: object | None = None, **kwargs) -> Dict[int, StepInfo]:
    # Only German step texts exist for the main panel; the same dict is returned for every language.
    _ = (lang, debug, kwargs)
    lv = (level or "").upper()
    if lv in ("SCHUELER", "SCHÜLER", "PUPIL", "SCHOOL", "SCHOOLER"):
        return STEP_TEXT_SCHUELER_DE
    return STEP_TEXT_STUDIS_DE


def build_gate_text(level: str, lang: str = "DE", debug: object | None = None, **kwargs) -> Dict[int, StepInfo]:
    """
    Black intermediate screens shown between analysis steps.
    Defined for STUDENT (DE/EN). Other levels return an empty dict.
    """
    _ = (debug, kwargs)
    lv = (level or "").upper()
    lg = (lang or "DE").upper()
    is_student = lv in ("STUDENT", "STUDIS", "STUDI", "STUDENTS")

    if not is_student:
        return {}

    if lg.startswith("EN"):
        return GATE_TEXT_STUDIS_EN
    return GATE_TEXT_STUDIS_DE


# -------------------------
# StepTransformer (Visual simulation)
# -------------------------
class StepTransformer:
    @staticmethod
    def to_gray(rgb: np.ndarray) -> np.ndarray:
        r = rgb[:, :, 0].astype(np.float32)
        g = rgb[:, :, 1].astype(np.float32)
        b = rgb[:, :, 2].astype(np.float32)
        y = 0.299 * r + 0.587 * g + 0.114 * b
        return y.astype(np.uint8)

    @staticmethod
    def sobel_edges(gray: np.ndarray) -> np.ndarray:
        # Approximate gradient magnitude via central differences
        # (a lightweight stand-in for a full Sobel kernel).
        g = gray.astype(np.int16)
        gx = (np.roll(g, -1, axis=1) - np.roll(g, 1, axis=1))
        gy = (np.roll(g, -1, axis=0) - np.roll(g, 1, axis=0))
        mag = np.clip(np.abs(gx) + np.abs(gy), 0, 255).astype(np.uint8)
        return mag

    @staticmethod
    def pixelate_and_square(rgb: np.ndarray, size: int = 300) -> np.ndarray:
        # Simulate the 300x300 input bottleneck: nearest-neighbour downsampling,
        # then upsampling back to the original resolution.
        h, w, _ = rgb.shape
        ys = (np.linspace(0, h - 1, size)).astype(np.int32)
        xs = (np.linspace(0, w - 1, size)).astype(np.int32)
        small = rgb[ys][:, xs]
        ys2 = (np.linspace(0, size - 1, h)).astype(np.int32)
        xs2 = (np.linspace(0, size - 1, w)).astype(np.int32)
        up = small[ys2][:, xs2]
        return up.astype(np.uint8)

    @staticmethod
    def fake_feature_map(rgb: np.ndarray) -> np.ndarray:
        g = StepTransformer.to_gray(rgb)
        e = StepTransformer.sobel_edges(g).astype(np.float32)
        if e.max() > 0:
            e = (e / e.max()) * 255.0
        e = e.astype(np.uint8)
        r = np.clip(e * 0.25, 0, 255).astype(np.uint8)
        gg = np.clip(e * 0.85, 0, 255).astype(np.uint8)
        b = np.clip(e * 1.10, 0, 255).astype(np.uint8)
        return np.stack([r, gg, b], axis=2)

    @staticmethod
    def dim(rgb: np.ndarray, factor: float) -> np.ndarray:
        return np.clip(rgb.astype(np.float32) * factor, 0, 255).astype(np.uint8)

    @staticmethod
    def matrix_like_overlay(rgb: np.ndarray) -> np.ndarray:
        # Dim the frame and scatter deterministic "code-like" lines and dots on top.
        out = StepTransformer.dim(rgb, 0.12)
        h, w, _ = out.shape
        rng = np.random.default_rng(1)
        for y in range(30, h, max(22, h // 28)):
            x0 = rng.integers(0, max(1, w - 200))
            length = rng.integers(140, 360)
            x1 = int(min(w - 1, x0 + length))
            out[y:y + 2, x0:x1, :] = np.array([220, 240, 210], dtype=np.uint8)
        for _ in range(30):
            y = rng.integers(0, h)
            x = rng.integers(0, w)
            out[y:y + 1, x:x + 1, :] = np.array([227, 217, 191], dtype=np.uint8)
        return out

    def apply(self, frame_rgb: np.ndarray, step: int, level: str | None = None) -> np.ndarray:
        if frame_rgb is None:
            return frame_rgb

        lv = (level or "").upper()

        # -------- SCHUELER (4 steps) visuals --------
        if lv in ("SCHUELER", "SCHÜLER", "PUPIL", "SCHOOL", "SCHOOLER"):
            if step == 1:
                return self.pixelate_and_square(frame_rgb, size=300)
            if step == 2:
                return self.fake_feature_map(frame_rgb)
            if step == 3:
                return self.dim(frame_rgb, 0.55)
            return frame_rgb

        # -------- STUDIS (7 steps) visuals --------
        if step == 1:
            # Pixelated via the 300x300 bottleneck, matching Step 2 visuals but displayed in 16:9.
            return self.pixelate_and_square(frame_rgb, size=300)
        if step == 2:
            return self.pixelate_and_square(frame_rgb, size=300)
        if step == 3:
            return self.fake_feature_map(frame_rgb)
        if step == 4:
            return self.matrix_like_overlay(frame_rgb)
        if step == 5:
            return self.dim(frame_rgb, 0.35)
        if step == 6:
            return self.dim(frame_rgb, 0.70)
        return frame_rgb
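The Student-level texts for Step 6 describe confidence thresholding and Non-Maximum Suppression; in this project the actual suppression happens inside the IMX500/picamera2 post-processing, so steps.py only narrates it. For readers who want to see the mechanics, a minimal greedy-NMS sketch in plain NumPy (illustration only, not used by the app):

```python
# Didactic sketch of greedy NMS as described in Step 6 (not part of steps.py).
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over Union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thres: float = 0.65) -> list[int]:
    """Keep the highest-scoring box and drop every strongly overlapping rival."""
    order = np.argsort(scores)[::-1]
    keep: list[int] = []
    while len(order) > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        order = np.array([i for i in rest if iou(boxes[best], boxes[i]) < iou_thres])
    return keep

boxes = np.array([[10, 10, 110, 110], [12, 8, 115, 112], [300, 300, 380, 420]], dtype=np.float32)
scores = np.array([0.92, 0.85, 0.70], dtype=np.float32)
print(nms(boxes, scores))  # -> [0, 2]: the two overlapping boxes collapse into one
```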
1
ui/__init__.py
Normal file
@ -0,0 +1 @@
BIN
ui/__pycache__/__init__.cpython-311.pyc
Normal file
BIN
ui/__pycache__/renderer.cpython-311.pyc
Normal file
BIN
ui/__pycache__/textlayout.cpython-311.pyc
Normal file
BIN
ui/__pycache__/theme.cpython-311.pyc
Normal file
158
ui/renderer.py
Normal file
@ -0,0 +1,158 @@
# imx500_gui/ui/renderer.py

from typing import Tuple, List
import pygame

from .theme import Theme


def clamp01(x: float) -> float:
    return max(0.0, min(1.0, float(x)))


class Renderer:
    """Draws all panels, text, bars, and video overlays."""

    def __init__(self, theme: Theme):
        self.t = theme

    def conf_color(self, conf: float) -> Tuple[int, int, int]:
        return self.t.GOOD if conf >= 0.60 else self.t.WARN

    def draw_card(self, surface, rect, fill=None, outline=None, radius=None):
        fill = self.t.PANEL if fill is None else fill
        outline = self.t.LINE if outline is None else outline
        radius = self.t.RADIUS if radius is None else radius
        pygame.draw.rect(surface, fill, rect, border_radius=radius)
        pygame.draw.rect(surface, outline, rect, width=1, border_radius=radius)

    def draw_text(self, surface, font, text, pos, color=None):
        color = self.t.TEXT if color is None else color
        surface.blit(font.render(text, True, color), pos)

    def draw_button(self, surface, rect, label, font, primary=False, border_width=1):
        """Uniform button rendering."""
        if primary:
            fill = self.t.BTN_PRIMARY_FILL
            outline = self.t.BTN_PRIMARY_BORDER
            text_col = self.t.BTN_PRIMARY_BORDER
        else:
            fill = self.t.PANEL_2
            outline = self.t.LINE
            text_col = self.t.TEXT

        pygame.draw.rect(surface, fill, rect, border_radius=self.t.RADIUS)
        pygame.draw.rect(surface, outline, rect, width=border_width, border_radius=self.t.RADIUS)

        txt_surf = font.render(label, True, text_col)
        txt_rect = txt_surf.get_rect(center=rect.center)
        surface.blit(txt_surf, txt_rect)

    def draw_pill(self, surface, font, text, pos, bg=(20, 20, 20, 180), fg=None):
        fg = self.t.TEXT if fg is None else fg
        txt = font.render(text, True, fg)
        w, h = txt.get_size()
        pill_rect = pygame.Rect(pos[0], pos[1], w + 16, h + 8)

        # Draw translucent background
        s = pygame.Surface((pill_rect.w, pill_rect.h), pygame.SRCALPHA)
        s.fill(bg)
        surface.blit(s, pill_rect.topleft)

        surface.blit(txt, (pos[0] + 8, pos[1] + 4))

    def rect_in_video_coords(self, box, src_size, video_rect) -> pygame.Rect:
        """
        Transform a box (x, y, w, h) from src_size (camera resolution)
        into the on-screen coordinates of the video panel.
        """
        sx, sy, sw, sh = box
        src_w, src_h = src_size

        scale_x = video_rect.w / src_w
        scale_y = video_rect.h / src_h

        rx = video_rect.x + int(sx * scale_x)
        ry = video_rect.y + int(sy * scale_y)
        rw = int(sw * scale_x)
        rh = int(sh * scale_y)
        return pygame.Rect(rx, ry, rw, rh)

    def draw_step_indicator(self, surface, rect, step, total_steps, font_small):
        # Background rail
        pygame.draw.rect(surface, self.t.PANEL_2, rect, border_radius=rect.height // 2)

        if total_steps < 1:
            return

        # Width of one segment
        seg_w = rect.w / total_steps

        # Fill active steps
        if step > 0:
            fill_w = step * seg_w
            fill_rect = pygame.Rect(rect.x, rect.y, fill_w, rect.h)
            pygame.draw.rect(surface, self.t.ACCENT, fill_rect, border_radius=rect.height // 2)

        # Draw segment separators
        for i in range(1, total_steps):
            x = rect.x + i * seg_w
            pygame.draw.line(surface, self.t.BG, (x, rect.y), (x, rect.bottom), 2)

        # Text indicator above the rail (padding raised to 12 px instead of 4 px)
        label = f"{step} / {total_steps}"
        txt = font_small.render(label, True, self.t.TEXT_MUTED)
        txt_rect = txt.get_rect(centerx=rect.centerx, bottom=rect.y - 12)
        surface.blit(txt, txt_rect)

    def draw_pixel_grid(self, surface, rect, spacing=20):
        """Simulates a pixel grid over the video panel."""
        # pygame.draw ignores per-pixel alpha on opaque targets, so draw the grid onto
        # an SRCALPHA overlay and blit it to get the intended transparency.
        overlay = pygame.Surface((rect.w, rect.h), pygame.SRCALPHA)
        col = (255, 255, 255, 30)  # very transparent white

        # Vertical lines
        for x in range(0, rect.w, spacing):
            pygame.draw.line(overlay, col, (x, 0), (x, rect.h))

        # Horizontal lines
        for y in range(0, rect.h, spacing):
            pygame.draw.line(overlay, col, (0, y), (rect.w, y))

        surface.blit(overlay, rect.topleft)

    def draw_bar_chart(self, surface, rect, top3: List[Tuple[str, float]], threshold: float, title_font, body_font):
        """Draws the bar chart for the top-3 predictions."""

        pad_top = title_font.get_linesize() + 10
        pad_left = 14
        pad_right = 14

        row_h = 28
        gap = 12

        chart_x = rect.x + pad_left
        chart_y = rect.y + pad_top
        chart_w = rect.w - pad_left - pad_right
        chart_h = 3 * row_h + 2 * gap

        # Threshold line
        thr_x = chart_x + int(chart_w * clamp01(threshold))
        pygame.draw.line(surface, self.t.ACCENT, (thr_x, chart_y - 8), (thr_x, chart_y + chart_h + 8), 2)

        if not top3:
            return

        for i, (lab, conf) in enumerate(top3[:3]):
            y = chart_y + i * (row_h + gap)

            # Row background
            pygame.draw.rect(surface, (28, 31, 37), pygame.Rect(chart_x, y, chart_w, row_h), border_radius=10)

            # Bar
            bar_col = self.conf_color(conf)
            bw = int(chart_w * clamp01(conf))
            pygame.draw.rect(surface, bar_col, pygame.Rect(chart_x, y, bw, row_h), border_radius=10)

            # Label text
            lbl_surf = body_font.render(f"{lab} {int(conf*100)}%", True, self.t.TEXT)
            surface.blit(lbl_surf, (chart_x + 8, y + (row_h - lbl_surf.get_height()) // 2))
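Because the Renderer only needs a pygame Surface, it can be smoke-tested off-screen without a camera or a window. A minimal sketch (assumes it is run from the project root so the `ui` package is importable; the labels and scores are made-up example data):

```python
# Off-screen smoke test for Renderer.draw_bar_chart (no display or camera required).
import pygame
from ui.theme import Theme
from ui.renderer import Renderer

pygame.font.init()
surface = pygame.Surface((480, 200))   # off-screen target
renderer = Renderer(Theme())

panel = pygame.Rect(10, 40, 460, 150)
renderer.draw_card(surface, panel)
renderer.draw_bar_chart(
    surface, panel,
    top3=[("person", 0.93), ("bicycle", 0.61), ("dog", 0.37)],
    threshold=0.55,
    title_font=pygame.font.Font(None, 28),
    body_font=pygame.font.Font(None, 22),
)
pygame.image.save(surface, "renderer_demo.png")  # inspect the result without a GUI
```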
82
ui/textlayout.py
Normal file
@ -0,0 +1,82 @@
# imx500_gui/ui/textlayout.py
from typing import List, Tuple
import pygame


class TextLayout:
    """Word wrapping + font fitting for the step descriptions."""

    @staticmethod
    def wrap_lines(text: str, font: pygame.font.Font, max_w: int) -> List[str]:
        text = text.replace("\r\n", "\n").replace("\r", "\n")
        paragraphs = text.split("\n")
        lines: List[str] = []

        for p in paragraphs:
            if p.strip() == "":
                lines.append("")
                continue

            words = p.split(" ")
            cur = ""
            for w in words:
                test = (cur + " " + w).strip()
                if font.size(test)[0] <= max_w:
                    cur = test
                else:
                    if cur:
                        lines.append(cur)
                    cur = w
            if cur:
                lines.append(cur)
        return lines

    @staticmethod
    def draw_wrapped_lines(surface, lines, font, color, x, y, line_spacing=4) -> int:
        line_h = font.get_linesize() + line_spacing
        for ln in lines:
            if ln == "":
                y += line_h
                continue
            surface.blit(font.render(ln, True, color), (x, y))
            y += line_h
        return y

    @staticmethod
    def split_title_body(title: str, body: str) -> Tuple[str, str]:
        # In the new structure, title and body already arrive separately from STEP_TEXT.
        return title.strip(), body.strip()

    @staticmethod
    def fit_title_and_body(title: str, body: str, rect: pygame.Rect,
                           min_body=18, max_body=40, title_ratio=1.30, line_spacing=4):
        # Binary search for the largest body font size whose wrapped title + body
        # still fit into the given rect.
        title = title.strip()
        body = body.strip()

        best_body = pygame.font.Font(None, min_body)
        best_title = pygame.font.Font(None, int(min_body * title_ratio))

        lo, hi = min_body, max_body
        while lo <= hi:
            body_size = (lo + hi) // 2
            title_size = max(body_size + 4, int(body_size * title_ratio))

            f_body = pygame.font.Font(None, body_size)
            f_title = pygame.font.Font(None, title_size)

            title_lines = TextLayout.wrap_lines(title, f_title, rect.w)
            body_lines = TextLayout.wrap_lines(body, f_body, rect.w) if body else []

            needed_h = 0
            needed_h += len(title_lines) * (f_title.get_linesize() + line_spacing)
            if body_lines:
                needed_h += (f_body.get_linesize() + line_spacing)
                needed_h += len(body_lines) * (f_body.get_linesize() + line_spacing)

            if needed_h <= rect.h:
                best_body, best_title = f_body, f_title
                lo = body_size + 1
            else:
                hi = body_size - 1

        return best_title, best_body
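A quick way to see the font fitting in action is to run it against an off-screen surface; the strings below are example data, not part of the module:

```python
# Illustration of TextLayout: fit fonts to a rect, wrap the body, then draw.
import pygame
from ui.textlayout import TextLayout

pygame.font.init()
surface = pygame.Surface((420, 260))
area = pygame.Rect(20, 20, 380, 220)

title = "3. Localization: Wo ist das Objekt?"
body = "Bounding Boxes werden als [x, y, width, height] beschrieben. " * 4

title_font, body_font = TextLayout.fit_title_and_body(title, body, area)
lines = TextLayout.wrap_lines(body, body_font, area.w)
y = TextLayout.draw_wrapped_lines(surface, [title], title_font, (255, 255, 255), area.x, area.y)
TextLayout.draw_wrapped_lines(surface, lines, body_font, (200, 200, 200), area.x, y + 6)
```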
22
ui/theme.py
Normal file
@ -0,0 +1,22 @@
# imx500_gui/ui/theme.py

class Theme:
    BG = (10, 11, 13)
    PANEL = (16, 18, 22)
    PANEL_2 = (18, 20, 25)
    LINE = (34, 38, 45)

    TEXT = (232, 234, 238)
    TEXT_MUTED = (170, 175, 184)
    BLACK = (0, 0, 0)

    # Accent colors & indicators
    ACCENT = (120, 120, 255)
    GOOD = (64, 214, 98)
    WARN = (245, 216, 88)

    # Button styles (landing page & primary actions)
    BTN_PRIMARY_FILL = (23, 66, 115)      # dark blue
    BTN_PRIMARY_BORDER = (227, 217, 191)  # beige/gold

    RADIUS = 14