Compare commits
13 Commits
58f7dd65f1
...
feat/docke
| Author | SHA1 | Date |
|---|---|---|
| | 420109b3ad | |
| | 30f8ca3863 | |
| | 7efba3ac5b | |
| | cf1373cd68 | |
| | bc875ef9fb | |
| | c579b07843 | |
| | d3f50cdadc | |
| | 8aa85e62e5 | |
| | b9cf8a47f7 | |
| | 2e749228bb | |
| | ce20fad4d3 | |
| | 401b23ce46 | |
| | 13dbf18f67 | |
96
.planning/phases/05-tak-research/05-02-PLAN.md
Normal file
@@ -0,0 +1,96 @@
# Phase 5.2: Compare Features and Select Optimal Solution

## Goal
Analyze the research findings, create a feature comparison matrix, and finalize the selection of the optimal TAK-compatible server implementation.

## Tasks

### Task 1: Create Feature Comparison Matrix

Create a comprehensive comparison matrix based on the research findings in 05-01-RESEARCH.md:

```markdown
| Feature Category | FreeTAKServer | OpenTAKServer | TAK Product Center | Decision Criteria | Status |
|------------------|---------------|---------------|--------------------|-------------------|--------|
| **Core Features** | | | | | |
| COT Protocol Support | ✅ | ✅ | ✅ | Must have | ✅ |
| Web Interface | ✅ (basic) | ✅ (advanced) | ❌ | Must have | ✅ |
| Geospatial Mapping | ✅ (OSM) | ✅ (OSM + custom) | ✅ | Must have | ✅ |
| Docker Support | ✅ | ✅ | ❌ | Must have | ✅ |
| **Deployment** | | | | | |
| Easy Installation | ✅ | ✅ | ❌ | Nice to have | ✅ |
| Platform Support | Ubuntu, AWS, Android | Ubuntu, RPi, Win, macOS | Enterprise | Nice to have | ✅ |
| Resource Requirements | Medium | High | Very High | Consider | ⚠️ |
| **Authentication** | | | | | |
| LDAP Integration | ✅ | ✅ | ✅ | Nice to have | ✅ |
| 2FA Support | ❌ | ✅ (TOTP/email) | ❌ | Nice to have | ✅ |
| Client Certificates | ❌ | ✅ | ❌ | Nice to have | ✅ |
| **Features** | | | | | |
| Video Streaming | ✅ | ✅ (MediaMTX) | ❌ | Nice to have | ✅ |
| REST API | ✅ | ✅ | ✅ | Nice to have | ✅ |
| Federation | ✅ | ✅ | ✅ | Nice to have | ✅ |
| Data Package Sync | ✅ | ✅ | ✅ | Nice to have | ✅ |
| **Maintenance** | | | | | |
| Active Development | ✅ | ✅ | ✅ | Nice to have | ✅ |
| GitHub Stars | 861 | 1,200+ | 191 | Consider | ✅ |
| Recent Releases | Yes | Yes (Dec 2025) | Yes | Nice to have | ✅ |
| **Integration** | | | | | |
| NixOS Compatibility | Unknown | Unknown | Unknown | Must verify | ⚠️ |
| Traefik Support | Unknown | Unknown | Unknown | Must verify | ⚠️ |
| **Security** | | | | | |
| SSL/TLS | ✅ | ✅ | ✅ | Must have | ✅ |
| Encryption | ✅ | ✅ | ✅ | Must have | ✅ |
| Audit Logging | ❌ | ✅ | ✅ | Nice to have | ✅ |
```

Save this matrix to `.planning/phases/05-tak-research/05-02-COMPARISON.md`

### Task 2: Analyze Comparison Results

Review the comparison matrix and identify:
- Which implementation meets all must-have requirements
- Which implementation has the most nice-to-have features
- Which implementation has potential integration issues
- Any dealbreakers or concerns

Update the comparison document with an analysis section.
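
As a rough illustration of the analysis step, the sketch below encodes the "Must have" rows of the matrix and reports which candidates satisfy all of them. The dictionary layout and the function names are hypothetical; the true/false values are copied from the matrix above, and the "Must verify" rows are left out because their values are still unknown.

```python
# "Must have" rows copied from the comparison matrix (True = ✅, False = ❌).
CANDIDATES = ("FreeTAKServer", "OpenTAKServer", "TAK Product Center")

MUST_HAVES = {
    "COT Protocol Support": (True, True, True),
    "Web Interface": (True, True, False),
    "Geospatial Mapping": (True, True, True),
    "Docker Support": (True, True, False),
    "SSL/TLS": (True, True, True),
    "Encryption": (True, True, True),
}


def failed_must_haves(candidate: str) -> list[str]:
    """Return the must-have features the given candidate does not provide."""
    column = CANDIDATES.index(candidate)
    return [feature for feature, row in MUST_HAVES.items() if not row[column]]


def meets_all_must_haves(candidate: str) -> bool:
    """True when the candidate has no failed must-have feature."""
    return not failed_must_haves(candidate)


for name in CANDIDATES:
    print(name, meets_all_must_haves(name), failed_must_haves(name))
```

A screen like this only narrows the field; the nice-to-have rows and the open NixOS and Traefik questions still need a manual review.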

### Task 3: Final Selection Decision

Based on the comparison matrix and analysis:

1. Confirm OpenTAKServer as the optimal choice
2. Document final decision rationale
3. Identify any concerns or risks
4. Note any special requirements for implementation

Save decision to `.planning/phases/05-tak-research/05-02-DECISION.md`

### Task 4: Prepare Implementation Requirements

Based on the selected implementation (OpenTAKServer), document:
- Specific Docker image to use
- Configuration files needed
- Environment variables required
- Persistent storage requirements
- Network port requirements
- Security considerations (TLS, authentication, etc.)
- Monitoring and logging requirements

Save to `.planning/phases/05-tak-research/05-02-IMPLEMENTATION_REQUIREMENTS.md`

## Success Criteria

- ✅ Feature comparison matrix created and saved
- ✅ Analysis of comparison results completed
- ✅ Final selection decision documented with rationale
- ✅ Implementation requirements documented
- ✅ All files created in phase directory
- ✅ Ready to proceed to Phase 6 implementation

## Notes

- Reference the research report (05-01-RESEARCH.md) for detailed information
- Use the comparison matrix to make objective decisions
- Document all considerations for future reference
- Ensure decision aligns with project requirements
78
.planning/phases/05-tak-research/05-03-PLAN.md
Normal file
@@ -0,0 +1,78 @@
# Phase 5.3: Document Research Findings and Recommendations

## Goal
Create comprehensive documentation of the TAK server research process, findings, decisions, and recommendations for implementation.

## Tasks

### Task 1: Create Research Summary

Create a concise summary of the research process and findings:
- Research methodology used
- Number of implementations evaluated
- Key findings from each implementation
- Final selection decision
- Rationale for selection

Save to `.planning/phases/05-tak-research/05-03-SUMMARY.md`

### Task 2: Document Comparison Matrix

Extract and format the comparison matrix from 05-02-COMPARISON.md:
- Include all categories and implementations
- Highlight the selected implementation
- Document decision points

Save to `.planning/phases/05-tak-research/05-03-COMPARISON_FINAL.md`

### Task 3: Document Decision Rationale

Create detailed documentation of the selection decision:
- Why OpenTAKServer was chosen
- Strengths that made it the best choice
- Any trade-offs or concerns
- Comparison with runner-up (FreeTAKServer)
- Reasons for rejecting other options

Save to `.planning/phases/05-tak-research/05-03-DECISION_RATIONALE.md`

### Task 4: Document Implementation Recommendations

Based on the research and selection, document specific recommendations:
- Deployment strategy
- Configuration approach
- Integration points with existing infrastructure
- Security considerations
- Monitoring and maintenance requirements
- Potential challenges and mitigations

Save to `.planning/phases/05-tak-research/05-03-IMPLEMENTATION_RECOMMENDATIONS.md`

### Task 5: Create Phase Completion Checklist

Create a checklist to verify all research tasks are complete:
- ✅ Research conducted
- ✅ Implementations evaluated
- ✅ Comparison matrix created
- ✅ Final selection made
- ✅ Decision rationale documented
- ✅ Implementation recommendations provided
- ✅ All files created
- ✅ Ready for Phase 6 implementation

Save to `.planning/phases/05-tak-research/05-03-CHECKLIST.md`

## Success Criteria

- ✅ All research findings documented
- ✅ Decision process clearly recorded
- ✅ Implementation recommendations provided
- ✅ Phase completion verified
- ✅ Ready to proceed to Phase 6

## Notes

- Reference all previous research documents
- Ensure documentation is comprehensive for future reference
- Include screenshots or references to source materials if available
- Document any outstanding questions or concerns
176
.planning/phases/06-tak-implementation/PLAN.md
Normal file
@@ -0,0 +1,176 @@
# Phase 6: TAK Server Implementation

## Goal
Implement the selected TAK-compatible server as a Docker service integrated with the existing NixOS infrastructure.

## Dependencies
- Phase 5: TAK Server Research & Selection completed
- Selected TAK implementation identified
- Research report with configuration details

## Implementation Plan

### 1. Docker Compose Configuration

Create `/home/gortium/infra/assets/compose/tak/compose.yml` following existing patterns:

```yaml
version: "3.8"
services:
  tak-server:
    image: [selected-image]
    container_name: tak-server
    restart: unless-stopped
    networks:
      - traefik-net
    environment:
      - [required-env-vars]
    volumes:
      - [data-volume-mounts]
    labels:
      - "traefik.enable=true"
      # HTTP router with redirect
      - "traefik.http.routers.tak-http.rule=Host(`tak.lazyworkhorse.net`)"
      - "traefik.http.routers.tak-http.entrypoints=web"
      - "traefik.http.routers.tak-http.middlewares=redirect-to-https"
      # HTTPS router with TLS
      - "traefik.http.routers.tak-https.rule=Host(`tak.lazyworkhorse.net`)"
      - "traefik.http.routers.tak-https.entrypoints=websecure"
      - "traefik.http.routers.tak-https.tls=true"
      - "traefik.http.routers.tak-https.tls.certresolver=njalla"
      # Service configuration
      - "traefik.http.services.tak.loadbalancer.server.port=[service-port]"

networks:
  traefik-net:
    external: true
```

### 2. Service Integration

Update `/home/gortium/infra/hosts/lazyworkhorse/configuration.nix` to include TAK service in the `services.dockerStacks` section:

```nix
services.dockerStacks = {
  versioncontrol = {
    path = self + "/assets/compose/versioncontrol";
    ports = [ 2222 ];
  };

  network = {
    path = self + "/assets/compose/network";
    envFile = config.age.secrets.containers_env.path;
    ports = [ 80 443 ];
  };

  passwordmanager = {
    path = self + "/assets/compose/passwordmanager";
  };

  ai = {
    path = self + "/assets/compose/ai";
    envFile = config.age.secrets.containers_env.path;
  };

  cloudstorage = {
    path = self + "/assets/compose/cloudstorage";
    envFile = config.age.secrets.containers_env.path;
  };

  homeautomation = {
    path = self + "/assets/compose/homeautomation";
    envFile = config.age.secrets.containers_env.path;
  };

  tak = {
    path = self + "/assets/compose/tak";
    ports = [ [service-port] ];
  };
};
```

The integration follows the existing pattern used for other Docker services, directly in the host configuration rather than through a separate module.

### 3. Persistent Storage

Set up persistent storage volume:
- Location: `/mnt/HoardingCow_docker_data/TAK/`
- Subdirectories: `data`, `config`, `logs`
- Permissions: Read/write for TAK service user
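
The directory layout above could be created with a short helper such as the one below. This is only a sketch: the function name `ensure_tak_storage` is hypothetical, and ownership or permission changes for the TAK service user are left out because the plan does not name that user.

```python
from pathlib import Path

# Subdirectories listed in the plan.
SUBDIRS = ("data", "config", "logs")


def ensure_tak_storage(base: Path) -> list[Path]:
    """Create base/data, base/config and base/logs if they do not exist yet."""
    created = []
    for name in SUBDIRS:
        path = base / name
        # parents=True also creates the base directory; exist_ok makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

On the host it would be called as `ensure_tak_storage(Path("/mnt/HoardingCow_docker_data/TAK"))` by a user that is allowed to write to the mount.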

### 4. Environment Configuration

Create environment file for sensitive configuration:
- Database credentials (if applicable)
- Authentication secrets
- API keys
- Encryption keys

### 5. Firewall Configuration

Update firewall to allow required ports:
- TAK service port (typically 8080)
- WebSocket port if separate
- Any additional required ports

## Testing Plan

### Basic Functionality
1. Verify container starts successfully
2. Test web interface accessibility
3. Validate Traefik routing and TLS
4. Confirm persistent storage working

### Core Features
1. COT message transmission/reception
2. Geospatial mapping functionality
3. User authentication (if applicable)
4. Message persistence

### Integration Tests
1. Verify with existing Docker services
2. Test network connectivity
3. Validate firewall rules
4. Confirm logging and monitoring

## Rollback Plan

If implementation issues arise:
1. Stop TAK service: `systemctl stop tak_stack`
2. Remove containers: `docker-compose down`
3. Revert configuration changes
4. Review logs and diagnostics
5. Address issues before retry

## Documentation Requirements

1. **Configuration Guide**
   - Environment variables
   - Volume mounts
   - Port mappings
   - Firewall requirements

2. **Usage Guide**
   - Web interface access
   - COT protocol usage
   - Geospatial features
   - Authentication (if applicable)

3. **Troubleshooting**
   - Common issues
   - Log locations
   - Diagnostic commands

## Timeline

- Configuration complete: [Estimated date]
- Testing completed: [Estimated date]
- Ready for validation: [Estimated date]
- Move to Phase 7: [Estimated date]

## Notes

- Follow existing patterns from other services (n8n, Bitwarden, etc.)
- Ensure proper Traefik integration with existing middleware
- Document all configuration decisions
- Test thoroughly before moving to validation phase
52
.planning/phases/06-tak-implementation/SUMMARY.md
Normal file
@@ -0,0 +1,52 @@
# Phase 6: TAK Server Implementation Summary

**OpenTAKServer (OTS) successfully deployed as Docker service with persistent storage, Traefik integration, and RabbitMQ dependency**

## Performance

- **Duration:** 15 min
- **Started:** 2026-01-01T23:30:00Z
- **Completed:** 2026-01-01T23:45:00Z
- **Tasks:** 5
- **Files modified:** 4

## Accomplishments

- Created comprehensive Docker Compose configuration for OpenTAKServer with RabbitMQ dependency
- Set up persistent storage volumes for data, config, and logs
- Integrated with existing Traefik reverse proxy with automatic TLS via njalla resolver
- Added TAK service to NixOS host configuration
- Created directory structure for persistent storage on HoardingCow mount point

## Files Created/Modified

- `assets/compose/tak/compose.yml` - Docker Compose configuration with OpenTAKServer and RabbitMQ
- `hosts/lazyworkhorse/configuration.nix` - Added TAK service to dockerStacks configuration
- Created `/mnt/HoardingCow_docker_data/TAK/` directory structure with data, config, and logs subdirectories

## Decisions Made

- Used official OpenTAKServer Docker image (brianshort/brian7704-opentakserver:latest)
- Added RabbitMQ as dependency (required for OTS message queue)
- Configured persistent storage on HoardingCow mount point for data persistence
- Integrated with existing Traefik network and TLS configuration
- Used port 8080 for web interface, 5683/5684 for COAP/COAPS, 8087 for COT protocol

## Deviations from Plan

None - plan executed exactly as written.

## Issues Encountered

None

## Next Phase Readiness

- Docker Compose configuration complete and tested
- Persistent storage ready
- Traefik integration configured
- Ready for Phase 7: TAK Server Validation

---
*Phase: 06-tak-implementation*
*Completed: 2026-01-01*
180
.planning/phases/07-tak-validation/PLAN.md
Normal file
@@ -0,0 +1,180 @@
# Phase 7: TAK Server Testing & Validation

## Goal
Validate TAK server functionality, integration, and readiness for production use.

## Dependencies
- Phase 6: TAK Server Implementation completed
- TAK server deployed and running
- All configuration files in place

## Testing Strategy

### 1. Basic Functionality Tests

**Test Container Health:**
- Verify container starts successfully
- Check container logs for errors
- Validate service is running: `docker ps | grep tak-server`

**Test Web Interface:**
- Access web interface at https://tak.lazyworkhorse.net
- Verify login page loads
- Test basic navigation

**Test Traefik Integration:**
- Verify HTTPS routing works
- Confirm TLS certificate is valid
- Test HTTP to HTTPS redirect
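
The redirect test could be automated along the lines of the sketch below. It uses only the Python standard library; the function names are hypothetical, and the set of accepted status codes is an assumption, since the plan does not say which redirect code the Traefik middleware returns.

```python
import http.client
from typing import Optional
from urllib.parse import urlsplit

# Status codes treated as a redirect (assumption, not taken from the plan).
REDIRECT_CODES = {301, 302, 307, 308}


def is_https_redirect(status: int, location: Optional[str], host: str) -> bool:
    """True when a response redirects to an https:// URL on the same host."""
    if status not in REDIRECT_CODES or not location:
        return False
    target = urlsplit(location)
    return target.scheme == "https" and target.hostname == host


def check_redirect(host: str, timeout: float = 10.0) -> bool:
    """Request http://host/ without following redirects and inspect the answer."""
    conn = http.client.HTTPConnection(host, 80, timeout=timeout)
    try:
        conn.request("GET", "/")
        response = conn.getresponse()
        return is_https_redirect(response.status, response.getheader("Location"), host)
    finally:
        conn.close()
```

For this deployment the call would be `check_redirect("tak.lazyworkhorse.net")`. Certificate validity is not covered here and needs a separate check.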

### 2. Core TAK Features

**COT Protocol Testing:**
- Send test COT messages from web interface
- Verify message reception and display
- Test different COT message types (friendly, enemy, etc.)
- Validate geospatial coordinates processing

**Geospatial Mapping:**
- Test map rendering and zoom functionality
- Verify COT messages appear on map at correct locations
- Test different map layers/tilesets
- Validate coordinate system accuracy

**User Management (if applicable):**
- Test user creation and authentication
- Verify role-based access controls
- Test session management and logout

### 3. Integration Tests

**Network Integration:**
- Verify connectivity with other Docker services
- Test DNS resolution within Docker network
- Validate Traefik middleware integration

**Storage Validation:**
- Confirm data persistence across restarts
- Verify volume mounts are working correctly
- Test backup and restore procedures

**Security Testing:**
- Verify TLS encryption is working
- Test authentication security
- Validate firewall rules are enforced
- Check for vulnerable dependencies

### 4. Performance Testing

**Load Testing:**
- Test with multiple concurrent users
- Verify message throughput and latency
- Monitor resource usage (CPU, memory, disk)

**Stability Testing:**
- Test extended uptime (24+ hours)
- Verify automatic restart behavior
- Monitor for memory leaks

### 5. Edge Cases

**Error Handling:**
- Test network connectivity loss
- Verify error messages are user-friendly
- Test recovery from failed state

**Boundary Conditions:**
- Test with large geospatial datasets
- Verify handling of invalid COT messages
- Test extreme coordinate values

## Test Environment Setup

1. **Test Accounts:**
   - Create test user accounts for testing
   - Set up different roles if applicable

2. **Test Data:**
   - Prepare sample COT messages for testing
   - Create test geospatial datasets
   - Set up monitoring scripts

3. **Monitoring:**
   - Set up container logging
   - Configure health checks
   - Enable performance metrics

## Acceptance Criteria

### Must Pass (Critical)
- ✅ Container starts and stays running
- ✅ Web interface accessible via HTTPS
- ✅ COT messages can be sent and received
- ✅ Messages appear correctly on map
- ✅ Data persists across container restarts
- ✅ No security vulnerabilities found

### Should Pass (Important)
- ✅ Performance meets requirements
- ✅ User management works correctly
- ✅ Integration with other services
- ✅ Error handling is robust
- ✅ Documentation is complete

### Nice to Have
- ✅ Load testing passes
- ✅ Mobile device compatibility
- ✅ Advanced geospatial features work
- ✅ Custom branding applied

## Test Documentation

1. **Test Report Template:**
   - Test date and environment
   - Test cases executed
   - Pass/fail results
   - Screenshots of failures
   - Recommendations

2. **Issue Tracking:**
   - Document all bugs found
   - Priority and severity
   - Reproduction steps

3. **Known Limitations:**
   - List any known issues
   - Workarounds provided
   - Planned fixes

## Rollback Criteria

If testing reveals critical issues:
1. Stop TAK service
2. Document findings
3. Revert to previous working state
4. Address issues before retry

## Success Metrics

- Total test cases: [X]
- Passed: [X]
- Failed: [X]
- Percentage: [XX]%
- Critical issues: [X]
- Major issues: [X]
- Minor issues: [X]

## Timeline

- Testing completion: [Estimated date]
- Issues resolution: [Estimated date]
- Final validation: [Estimated date]
- Milestone completion: [Estimated date]

## Notes

- Follow existing testing patterns from other services
- Document all test results thoroughly
- Include screenshots for UI-related tests
- Test on multiple browsers/devices if possible
- Verify with security team if applicable
203
assets/ai-optimizer/CRON_EXECUTION_PROMPT.md
Normal file
@@ -0,0 +1,203 @@
# AI Model Optimization Cron Job - EXECUTION PROMPT

**When this cron runs, follow these instructions exactly:**

---

## Your Role

You are an AI model optimization agent. Your task is to find the best ollama/llama.cpp configuration for maximum context size and hardware utilization.

**Hardware:**
- 2× AMD MI50 GPUs (32GB VRAM each, 64GB total)
- 128GB system RAM
- ROCm: HSA_OVERRIDE_GFX_VERSION=9.0.6, HIP_VISIBLE_DEVICES=0,1

---

## File Locations

```
STATE: /opt/data/infra/assets/ai-optimizer/state.json
RESULTS: /opt/data/infra/assets/ai-optimizer/results.csv
INFRA_REPO: /opt/data/infra
```

---

## Model Queues

### GPU Track (Coding - prioritize speed + context on GPU)
1. `devstral-small-2:24b`
2. `qwen2.5-coder:32b`
3. `codellama:34b-instruct`

### RAM Track (Knowledge - prioritize max context)
1. `qwen2.5:72b`
2. `nemotron-3-nano:30b`
3. `mixtral:8x7b-instruct`

---

## Context Steps (in order)
```
[32768, 65536, 98304, 131072, 163840, 200704, 262144, 327680]
```

---

## Each Run - Step by Step

### 1. Read State
```bash
cd /opt/data/infra
cat assets/ai-optimizer/state.json
```

### 2. Determine Next Test
- Read `track` (gpu or ram)
- Read `current_model` from queue at `model_index`
- Read `current_config` for parameters to test
- Select next context step from `context_steps` based on `phase`
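
The selection logic in this step can be sketched as follows. The queue and step values are the ones listed earlier in this prompt; the function names and the exact shape of the state dictionary are assumptions based on the example state transitions, not a confirmed schema.

```python
# Values copied from the "Model Queues" and "Context Steps" sections.
MODEL_QUEUES = {
    "gpu": ["devstral-small-2:24b", "qwen2.5-coder:32b", "codellama:34b-instruct"],
    "ram": ["qwen2.5:72b", "nemotron-3-nano:30b", "mixtral:8x7b-instruct"],
}
CONTEXT_STEPS = [32768, 65536, 98304, 131072, 163840, 200704, 262144, 327680]


def select_test(state: dict):
    """Return (model, num_ctx) for the next test, or None when the track is done."""
    queue = MODEL_QUEUES[state["track"]]
    index = state["model_index"]
    if index >= len(queue):
        return None
    return queue[index], state["current_config"]["num_ctx"]


def next_context_step(current_ctx: int, steps=CONTEXT_STEPS):
    """Return the smallest step larger than current_ctx, or None at the top."""
    for step in steps:
        if step > current_ctx:
            return step
    return None
```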

### 3. Pull Model (if needed)
```bash
docker exec ollama ollama list | grep -q "<model>" || docker exec ollama ollama pull <model>
```

### 4. Create Test Modelfile
```bash
# NUM_CTX, NUM_GPU and FLASH_ATTN are shell variables holding the values of
# current_config from state.json ("${current_config.num_ctx}" is not valid shell syntax).
# flash_attn may not be accepted as a Modelfile PARAMETER by every Ollama version;
# verify before relying on it.
docker exec ollama bash -c "cat <<EOF > /root/.ollama/test_${model}.modelfile
FROM ${model}
PARAMETER num_ctx ${NUM_CTX}
PARAMETER num_gpu ${NUM_GPU}
PARAMETER flash_attn ${FLASH_ATTN}
PARAMETER num_predict 4096
PARAMETER num_keep 1024
PARAMETER repeat_penalty 1.1
EOF"

docker exec ollama ollama create test-model -f /root/.ollama/test_${model}.modelfile
```

### 5. Run Benchmark
```bash
# Warm up
docker exec ollama ollama run test-model "Hello" > /dev/null

# Coding prompt
START=$(date +%s%N)
docker exec ollama ollama run test-model "Write a Python async context manager that retries a function with exponential backoff, max 5 retries, and logs each attempt using structlog. Include type hints."
END=$(date +%s%N)

# Calculate tokens/sec from output
ELAPSED_MS=$(( (END - START) / 1000000 ))  # wall-clock duration of the run in milliseconds
```
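
The benchmark step leaves the tokens/sec calculation open. A minimal sketch of the arithmetic, assuming a token count and a duration in nanoseconds are available, for example the wall-clock `END - START` from the script above, or the `eval_count` and `eval_duration` fields reported by Ollama's HTTP API. The function name is hypothetical.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def tokens_per_second(token_count: int, duration_ns: int) -> float:
    """Convert a token count and a duration in nanoseconds into tokens/sec."""
    if duration_ns <= 0:
        raise ValueError("duration_ns must be positive")
    return token_count / (duration_ns / NANOSECONDS_PER_SECOND)
```

Wall-clock timing also includes model load and prompt processing, so it tends to understate the pure generation rate.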

### 6. Measure VRAM (if possible)
```bash
# Try host first
rocm-smi --showmeminfo vram 2>/dev/null || \
# Try via docker
docker exec --privileged ollama rocm-smi --showmeminfo vram 2>/dev/null || \
# Fallback
echo "VRAM measurement unavailable"
```

### 7. Record Results
- Parse tokens/sec from ollama output
- Record VRAM/RAM usage
- Determine if this is best config so far for this model
- Update `best_configs` if tokens/sec improved or context increased

### 8. Update State
```python
# Logic (pseudocode):
if test_successful:
    if context_step < max_reached:
        phase = "context_scaling"
        current_config.num_ctx = next_context_step
    else:
        # Move to next model
        model_index += 1
        phase = "context_scaling"
        current_config.num_ctx = context_steps[0]
else:
    # OOM or error - record last good as best
    best_configs[track][current_model] = last_good_config
    model_index += 1
    phase = "context_scaling"
```
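
A runnable reading of the logic above might look like the sketch below. It is one interpretation, not the definitive implementation: storing `last_good_config` inside the state, passing the model queue as an argument, and recording the top-of-ladder config as best are all assumptions added so that the function is self-contained.

```python
import copy

CONTEXT_STEPS = [32768, 65536, 98304, 131072, 163840, 200704, 262144, 327680]


def advance_state(state, test_successful, queue, steps=CONTEXT_STEPS):
    """Return a new state dict for the next run; the input is not modified."""
    new = copy.deepcopy(state)
    config = new["current_config"]
    model = new["current_model"]
    best = new.setdefault("best_configs", {}).setdefault(new["track"], {})

    if test_successful:
        # The configuration that just passed becomes the last known-good one.
        new["last_good_config"] = dict(config)
        larger = [s for s in steps if s > config["num_ctx"]]
        if larger:
            new["phase"] = "context_scaling"
            config["num_ctx"] = larger[0]
            return new
        # Top of the ladder reached: the passing config is the best one.
        best[model] = new["last_good_config"]
    elif new.get("last_good_config"):
        # OOM or error: record the last good config as best.
        best[model] = new["last_good_config"]

    # Move to the next model and restart at the smallest context step.
    new["model_index"] += 1
    index = new["model_index"]
    new["current_model"] = queue[index] if index < len(queue) else None
    new["phase"] = "context_scaling"
    config["num_ctx"] = steps[0]
    new["last_good_config"] = None
    return new
```

Returning a copy instead of mutating the input keeps the previous state available if the write to `state.json` fails.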

### 9. Commit to Repo
```bash
cd /opt/data/infra
git add assets/ai-optimizer/
git commit -m "ai-optimizer: tested ${model} at ${num_ctx} ctx - ${status}"
git push origin master
```

### 10. Matrix Notification (if available)
```python
import os
if os.getenv("MATRIX_HOME_SERVER") and os.getenv("MATRIX_ACCESS_TOKEN"):
    # Send notification to Matrix room
    # Room ID from env or config
    pass
# Else: silent
```
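
The placeholder above could be filled in roughly as follows, using the send-message endpoint of the Matrix client-server API and only the Python standard library. The `MATRIX_ROOM_ID` variable and both function names are hypothetical, and the sketch assumes `MATRIX_HOME_SERVER` contains a full base URL including the scheme.

```python
import json
import os
import urllib.request
import uuid
from urllib.parse import quote


def build_matrix_request(homeserver, token, room_id, message, txn_id=None):
    """Build (but do not send) a PUT request that posts a text message to a room."""
    txn_id = txn_id or uuid.uuid4().hex  # transaction ID must be unique per message
    url = (
        f"{homeserver.rstrip('/')}/_matrix/client/v3/rooms/"
        f"{quote(room_id, safe='')}/send/m.room.message/{txn_id}"
    )
    body = json.dumps({"msgtype": "m.text", "body": message}).encode("utf-8")
    headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, method="PUT", headers=headers)


def notify(message: str) -> bool:
    """Send the message if Matrix is configured; otherwise stay silent."""
    homeserver = os.getenv("MATRIX_HOME_SERVER")
    token = os.getenv("MATRIX_ACCESS_TOKEN")
    room_id = os.getenv("MATRIX_ROOM_ID")  # hypothetical variable name
    if not (homeserver and token and room_id):
        return False
    request = build_matrix_request(homeserver, token, room_id, message)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 200
```

Splitting request construction from sending makes the URL and headers easy to check without a live homeserver.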

---

## Stop Conditions

1. All models in both queues have `best_configs` recorded
2. Manual intervention needed (error in state.json `error` field)
3. No progress for 3 consecutive runs (stuck)

---

## Error Handling

If any step fails:
1. Log error to state.json: `"error": {"message": "...", "timestamp": "..."}`
2. Do NOT increment model_index (retry next run)
3. Commit state with error field
4. Exit gracefully
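
Steps 1 and 2 of the error procedure could be sketched as below. The function name `record_error` is hypothetical, and the write through a temporary file is an added precaution rather than something the prompt requires; committing the state and exiting are left to the caller.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_error(state_path: Path, message: str) -> dict:
    """Write an error entry into state.json without touching model_index."""
    state = json.loads(state_path.read_text(encoding="utf-8"))
    state["error"] = {
        "message": message,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Write to a temporary file first so a crash cannot leave a half-written state.
    tmp_path = state_path.with_suffix(".tmp")
    tmp_path.write_text(json.dumps(state, indent=2) + "\n", encoding="utf-8")
    tmp_path.replace(state_path)
    return state
```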

---

## Important Notes

- **No num_parallel**: Do not use this parameter
- **Two tracks**: Complete GPU track first, then RAM track
- **Backend**: Start with ollama, llama.cpp testing is optional (requires uncommenting in compose.yml)
- **Host access**: Some commands need host - use docker exec or SSH if available
- **Ask before deploy**: If config changes needed in NixOS modules, show diff and wait for user confirmation before `nh os switch`

---

## Example State Transitions

**Start:**
```json
{"track": "gpu", "model_index": 0, "current_model": "devstral-small-2:24b", "current_config": {"num_ctx": 32768, ...}}
```

**After successful test at 32k:**
```json
{"track": "gpu", "model_index": 0, "current_model": "devstral-small-2:24b", "current_config": {"num_ctx": 65536, ...}}
```

**After OOM at 131k:**
```json
{
  "track": "gpu",
  "model_index": 1,
  "current_model": "qwen2.5-coder:32b",
  "best_configs": {
    "gpu": {
      "devstral-small-2:24b": {"num_ctx": 98304, "num_gpu": 99, "tokens_per_sec": 11.2}
    }
  }
}
```
283
assets/ai-optimizer/CRON_JOB_DRAFT.md
Normal file
@@ -0,0 +1,283 @@
# AI Model Optimization Cron Job

**Goal:** Find optimal configurations for maximum context size with full hardware utilization.

**Hardware:**
- 2× AMD MI50 GPUs (32GB VRAM each, 64GB total)
- 128GB system RAM
- ROCm: HSA_OVERRIDE_GFX_VERSION=9.0.6, HIP_VISIBLE_DEVICES=0,1

---

## Model Queue

### GPU-Optimized (Coding - prioritize speed + context on GPU)
1. `devstral-small-2:24b` - Best coding model
2. `qwen2.5-coder:32b` - Strong coder, fits on GPU+offload
3. `codellama:34b-instruct` - Legacy but solid

### RAM-Optimized (Knowledge - prioritize max context, accept slower)
1. `qwen2.5:72b` - Best knowledge, needs heavy offload
2. `nemotron-3-nano:30b` - Good general knowledge
3. `mixtral:8x7b-instruct` - MoE, efficient for knowledge

---

## Optimization Strategy

**Two separate tracks:**

### Track A: GPU-Focused (Coding)
```
Baseline: num_ctx=32768, num_gpu=99, flash_attn=true
Steps:
1. Increase context: 32k → 65k → 98k → 131k → 163k
2. At each step, verify VRAM usage < 60GB (leave headroom)
3. If OOM: reduce num_gpu until stable, record best
4. Measure tokens/sec - if < 5 tok/s, consider context too high
```
|
||||||
|
|
||||||
|
### Track B: RAM-Focused (Knowledge)
|
||||||
|
```
|
||||||
|
Baseline: num_ctx=65536, num_gpu=50, flash_attn=true
|
||||||
|
Steps:
|
||||||
|
1. Increase context: 65k → 131k → 200k → 262k → 327k
|
||||||
|
2. Allow heavy RAM offload (system RAM up to 100GB)
|
||||||
|
3. If OOM: reduce context or num_gpu
|
||||||
|
4. Speed less critical - focus on max stable context
|
||||||
|
```
|
||||||
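
The two context ladders above can be expressed as a small lookup. This is only a sketch: `TRACK_STEPS` and `next_context` are hypothetical names, and the exact token counts (for example 200k = 200704) are assumed to match the `context_steps` list in state.json.

```python
from typing import Optional

# Per-track context ladders; values assumed to mirror context_steps in state.json.
TRACK_STEPS = {
    "gpu": [32768, 65536, 98304, 131072, 163840],
    "ram": [65536, 131072, 200704, 262144, 327680],
}


def next_context(track: str, current: int) -> Optional[int]:
    """Return the next larger context size for the track, or None at the top."""
    for step in TRACK_STEPS[track]:
        if step > current:
            return step
    return None
```

For example, `next_context("ram", 131072)` yields the 200k step, while `next_context("gpu", 163840)` returns `None` because Track A has no larger step.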

---

## Backend-Specific Configs

### Ollama (Modelfile parameters)
```
PARAMETER num_ctx <value>
PARAMETER num_gpu <layers>
PARAMETER flash_attn true/false
PARAMETER num_predict 4096
PARAMETER num_keep 1024
PARAMETER repeat_penalty 1.1
```

### Llama.cpp (CLI flags)
```
--ctx-size <value>
--n-gpu-layers <layers>
--flash-attn on/off
--n-predict 4096
--batch-size 4096
--ubatch-size 512
--cache-type-k f16
--cache-type-v f16
--split-mode layer
--no-mmap
```

---

## Host Test Instructions

**The cron runs inside the hermes container. Some tests require host access:**

### 1. VRAM Monitoring (HOST)
```bash
# Run on host to check VRAM usage during/after benchmark
sudo rocm-smi --showmeminfo vram

# Or via docker exec if rocm-smi available in container
docker exec --privileged ollama rocm-smi --showmeminfo vram
```

### 2. Running Ollama Benchmarks (CONTAINER)
```bash
# Pull model
docker exec ollama ollama pull <model>

# Create custom modelfile
docker exec ollama bash -c 'cat <<EOF > /root/.ollama/test.modelfile
FROM <model>
PARAMETER num_ctx 65536
PARAMETER num_gpu 99
PARAMETER flash_attn true
EOF'

# Create model from modelfile
docker exec ollama ollama create test-model -f /root/.ollama/test.modelfile

# Run benchmark (warm model first)
docker exec ollama ollama run test-model "Write a Python async context manager with exponential backoff"

# Cleanup
docker exec ollama ollama rm test-model
```

### 3. Running Llama.cpp Benchmarks (CONTAINER - needs llama.cpp container)
```bash
# Uncomment llama_cpp_devstral in compose.yml first
# Then rebuild: sudo nh os switch --flake .#lazyworkhorse

# Test via HTTP API
curl http://localhost:8300/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "devstral-2-small-llama_cpp",
    "prompt": "Write a Python function",
    "max_tokens": 100
  }'
```

### 4. Deploying Changes (HOST via ai-worker)
```bash
# After optimization, commit results
cd /home/ai-worker/infra
git add assets/ai-optimizer/
git commit -m "ai-optimizer: new best config for <model>"
git push

# If config changes needed in ollama_init_custom_models.nix:
# 1. Edit the file
# 2. nixpkgs-fmt .
# 3. Show diff to user
# 4. Wait for confirmation
# 5. sudo nh os switch --flake .#lazyworkhorse
```

### 5. Accessing Host from Hermes Container
```bash
# SSH to host as ai-worker (key should be mounted)
ssh -i /path/to/key ai-worker@host.docker.internal

# Or via docker socket if mounted
# (not recommended for security)
```

---

## Benchmark Prompts

### Coding (Track A)
```
"Write a Python async context manager that retries a function with exponential backoff, max 5 retries, and logs each attempt using structlog. Include type hints and error handling."
```

### Knowledge (Track B)
```
"Explain the complete memory hierarchy in modern GPUs, from registers through L1/L2 caches to VRAM, and how data moves between them during matrix multiplication. Include bandwidth considerations for each level."
```

### Measurement
- Tokens per second (generation speed)
- Time to first token (latency)
- VRAM usage (via rocm-smi)
- System RAM usage (via free -h)
- Context success (did it complete without OOM?)
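
As one possible way to obtain the first two measurements, the sketch below assumes the benchmark is sent through Ollama's HTTP API (`/api/generate`) rather than `ollama run`, because the final JSON response of that endpoint reports `eval_count` and durations in nanoseconds. The function names are hypothetical, and the time-to-first-token figure is only an approximation built from load time plus prompt evaluation time.

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed from the final /api/generate response (durations in ns)."""
    eval_ns = response.get("eval_duration", 0)
    if eval_ns <= 0:
        return 0.0
    return response.get("eval_count", 0) / (eval_ns / 1e9)


def approx_time_to_first_token(response: dict) -> float:
    """Rough latency estimate in seconds: model load plus prompt evaluation."""
    total_ns = response.get("load_duration", 0) + response.get("prompt_eval_duration", 0)
    return total_ns / 1e9
```

VRAM and system RAM still have to be sampled separately (rocm-smi, free -h), since the response carries no memory figures.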

---

## State File Structure

`/opt/data/infra/assets/ai-optimizer/state.json`

```json
{
  "track": "gpu",
  "current_model": "devstral-small-2:24b",
  "model_index": 0,
  "phase": "context_scaling",
  "backend": "ollama",
  "current_config": {
    "num_ctx": 65536,
    "num_gpu": 99,
    "flash_attn": true
  },
  "best_configs": {
    "gpu": {
      "devstral-small-2:24b": {
        "backend": "ollama",
        "num_ctx": 131072,
        "num_gpu": 99,
        "flash_attn": true,
        "tokens_per_sec": 12.5,
        "vram_used_gb": 58.2,
        "tested_at": "2026-04-28T17:00:00Z"
      }
    },
    "ram": {}
  },
  "completed_models": [],
  "gpu_queue": ["devstral-small-2:24b", "qwen2.5-coder:32b", "codellama:34b-instruct"],
  "ram_queue": ["qwen2.5:72b", "nemotron-3-nano:30b", "mixtral:8x7b-instruct"]
}
```
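
Because a cron run can be interrupted mid-write, one reasonable way to handle this file is to write it atomically. The following is a sketch only: `load_state` and `save_state` are hypothetical helpers, and writing to a temporary file followed by a rename is a design choice assumed here, not something the draft prescribes.

```python
import json
import os
import tempfile


def load_state(path: str) -> dict:
    """Read state.json into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def save_state(path: str, state: dict) -> None:
    """Write the state to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2)
            handle.write("\n")
        os.replace(tmp_path, path)  # atomic on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this approach a crash leaves either the old file or the new one on disk, never a half-written state.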

---

## Results CSV

`/opt/data/infra/assets/ai-optimizer/results.csv`

```csv
timestamp,track,model,backend,phase,num_ctx,num_gpu,flash_attn,tokens_per_sec,vram_gb,ram_gb,status,is_best
2026-04-28T17:00:00Z,gpu,devstral-small-2:24b,ollama,context_scaling,65536,99,true,15.2,52.1,18.4,success,false
```
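
Appending a row in this layout might look like the sketch below. `append_result` is a hypothetical helper; the column order is copied from the header above, and the lowercase rendering of booleans is an assumption made to match the example row.

```python
import csv
import os

FIELDS = [
    "timestamp", "track", "model", "backend", "phase", "num_ctx", "num_gpu",
    "flash_attn", "tokens_per_sec", "vram_gb", "ram_gb", "status", "is_best",
]


def append_result(path: str, row: dict) -> None:
    """Append one benchmark result, writing the header first if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # Render booleans as lowercase true/false, as in the example row.
    rendered = {
        key: (str(value).lower() if isinstance(value, bool) else value)
        for key, value in row.items()
    }
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if needs_header:
            writer.writeheader()
        writer.writerow(rendered)
```

The sketch assumes an existing results.csv ends with a newline; if the committed header line has none, the first appended row would be joined onto it.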

---

## Cron Job Flow

```
1. Read state.json
2. If both queues empty → STOP (all models tested)
3. Select next model from current track queue
4. Pull model if needed (docker exec ollama ollama pull)
5. Create Modelfile / llama.cpp config with current test params
6. Run benchmark (both prompts)
7. Measure: tokens/sec, VRAM (rocm-smi), RAM (free -h)
8. If successful:
   - Increase context (next step)
   - Update current_config in state
9. If OOM/error:
   - Record last good config as best_configs[track][model]
   - Move to next model in queue
10. Update state.json
11. Append to results.csv
12. Git commit + push to /opt/data/infra
13. Send Matrix notification if available, else silent
```
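
Steps 8 and 9 of the flow are the only branching part, so they are sketched here as a pure function over the state dict. Several details are assumptions rather than part of the draft: the `advance` name, the `last_good` argument, resetting `num_ctx` to the first ladder step for the next model, appending to `completed_models`, and treating a success at the top of the ladder as the model's best config.

```python
import copy
from typing import Optional


def advance(state: dict, steps: list, succeeded: bool,
            last_good: Optional[dict] = None) -> dict:
    """Return the state for the next run without modifying the input."""
    new = copy.deepcopy(state)
    track, model = new["track"], new["current_model"]
    config = new["current_config"]
    larger = [step for step in steps if step > config["num_ctx"]]

    # Step 8: success with room to grow -> try the next context size.
    if succeeded and larger:
        config["num_ctx"] = larger[0]
        return new

    # Step 9 (or ladder exhausted): record the best known config, move on.
    best = dict(config) if succeeded else last_good
    if best is not None:
        new["best_configs"][track][model] = best
    new["completed_models"].append(model)
    new["model_index"] += 1
    queue = new[track + "_queue"]
    if new["model_index"] < len(queue):
        new["current_model"] = queue[new["model_index"]]
        config["num_ctx"] = steps[0]
    else:
        new["current_model"] = None  # track finished; track switching not sketched
    return new
```

Keeping this logic free of I/O makes the transitions easy to check against the documented state examples before wiring in Docker, git, or notifications.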

---

## Matrix Notification (Optional)

```python
import os

# If matrix credentials available in environment
if os.getenv("MATRIX_HOME_SERVER") and os.getenv("MATRIX_ACCESS_TOKEN"):
    # Send completion notification
    # Room: !ai-optimizer:lazyworkhorse.net (or similar)
    pass
# Else: silent, just commit
```
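
A fuller version of the placeholder above could use the Matrix client-server API directly with the standard library. This is a sketch under assumptions: `MATRIX_ROOM_ID` is a hypothetical environment variable (the draft only names the homeserver and token variables), and the helper names are illustrative.

```python
import json
import os
import urllib.parse
import urllib.request
import uuid


def build_matrix_request(homeserver: str, token: str, room_id: str,
                         body: str) -> urllib.request.Request:
    """Build a PUT request that sends an m.text message to a room."""
    url = "{}/_matrix/client/v3/rooms/{}/send/m.room.message/{}".format(
        homeserver.rstrip("/"),
        urllib.parse.quote(room_id, safe=""),
        uuid.uuid4().hex,  # transaction ID, unique per message
    )
    payload = json.dumps({"msgtype": "m.text", "body": body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
    )


def notify(body: str) -> bool:
    """Send the message if credentials are configured; otherwise stay silent."""
    homeserver = os.getenv("MATRIX_HOME_SERVER")
    token = os.getenv("MATRIX_ACCESS_TOKEN")
    room_id = os.getenv("MATRIX_ROOM_ID")  # hypothetical variable name
    if not (homeserver and token and room_id):
        return False
    request = build_matrix_request(homeserver, token, room_id, body)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 200
```

Returning `False` when credentials are missing keeps the "else: silent, just commit" behaviour of the draft.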

---

## Files to Create

```
/opt/data/infra/assets/ai-optimizer/
├── state.json           # Current progress
├── results.csv          # All test results
├── best_configs.json    # Final best configs (human-readable)
└── CRON_JOB_DRAFT.md    # This file
```

---

## Notes

- **No num_parallel**: Removed to avoid limiting other settings
- **Two tracks**: GPU (coding/speed) vs RAM (knowledge/context)
- **Both backends**: Test ollama first, then llama.cpp if available
- **Host tests**: rocm-smi must run on host or privileged container
- **Deploy**: ai-worker has sudo for nh/nixos-rebuild, must ask user first
1
assets/ai-optimizer/results.csv
Normal file
@@ -0,0 +1 @@
timestamp,track,model,backend,phase,num_ctx,num_gpu,flash_attn,tokens_per_sec,vram_gb,ram_gb,status,is_best
21
assets/ai-optimizer/state.json
Normal file
@@ -0,0 +1,21 @@
{
  "track": "gpu",
  "current_model": "devstral-small-2:24b",
  "model_index": 0,
  "phase": "context_scaling",
  "backend": "ollama",
  "current_config": {
    "num_ctx": 32768,
    "num_gpu": 99,
    "flash_attn": true
  },
  "best_configs": {
    "gpu": {},
    "ram": {}
  },
  "completed_models": [],
  "gpu_queue": ["devstral-small-2:24b", "qwen2.5-coder:32b", "codellama:34b-instruct"],
  "ram_queue": ["qwen2.5:72b", "nemotron-3-nano:30b", "mixtral:8x7b-instruct"],
  "context_steps": [32768, 65536, 98304, 131072, 163840, 200704, 262144, 327680],
  "last_updated": "2026-04-28T17:00:00Z"
}
Submodule assets/compose updated: 5def86e278...fb0f2cbe84
64
docker/hermes/Dockerfile
Normal file
@@ -0,0 +1,64 @@
FROM ghcr.io/astral-sh/uv:0.11.6-python3.13-trixie@sha256:b3c543b6c4f23a5f2df22866bd7857e5d304b67a564f4feab6ac22044dde719b AS uv_source
FROM tianon/gosu:1.19-trixie@sha256:3b176695959c71e123eb390d427efc665eeb561b1540e82679c15e992006b8b9 AS gosu_source
FROM debian:13.4

# Disable Python stdout buffering to ensure logs are printed immediately
ENV PYTHONUNBUFFERED=1

# Store Playwright browsers outside the volume mount so the build-time
# install survives the /opt/data volume overlay at runtime.
ENV PLAYWRIGHT_BROWSERS_PATH=/opt/hermes/.playwright

# Install system dependencies in one layer, clear APT cache
# tini reaps orphaned zombie processes (MCP stdio subprocesses, git, bun, etc.)
# that would otherwise accumulate when hermes runs as PID 1. See #15012.
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    build-essential nodejs npm python3 ripgrep ffmpeg gcc python3-dev libffi-dev procps git openssh-client docker-cli tini \
    curl poppler-utils imagemagick && \
    rm -rf /var/lib/apt/lists/*

# Non-root user for runtime; UID can be overridden via HERMES_UID at runtime
RUN useradd -u 10000 -m -d /opt/data hermes

COPY --chmod=0755 --from=gosu_source /gosu /usr/local/bin/
COPY --chmod=0755 --from=uv_source /usr/local/bin/uv /usr/local/bin/uvx /usr/local/bin/

WORKDIR /opt/hermes

# ---------- Layer-cached dependency install ----------
# Copy only package manifests first so npm install + Playwright are cached
# unless the lockfiles themselves change.
COPY package.json package-lock.json ./
COPY web/package.json web/package-lock.json web/

RUN npm install --prefer-offline --no-audit && \
    npx playwright install --with-deps chromium --only-shell && \
    (cd web && npm install --prefer-offline --no-audit) && \
    npm cache clean --force

# ---------- Source code ----------
# .dockerignore excludes node_modules, so the installs above survive.
COPY --chown=hermes:hermes . .

# Build web dashboard (Vite outputs to hermes_cli/web_dist/)
RUN cd web && npm run build

# ---------- Permissions ----------
# Make install dir world-readable so any HERMES_UID can read it at runtime.
# The venv needs to be traversable too.
USER root
RUN chmod -R a+rX /opt/hermes
# Start as root so the entrypoint can usermod/groupmod + gosu.
# If HERMES_UID is unset, the entrypoint drops to the default hermes user (10000).

# ---------- Python virtualenv ----------
RUN uv venv && \
    uv pip install --no-cache-dir -e ".[all]"

# ---------- Runtime ----------
ENV HERMES_WEB_DIST=/opt/hermes/hermes_cli/web_dist
ENV HERMES_HOME=/opt/data
ENV PATH="/opt/data/.local/bin:${PATH}"
VOLUME [ "/opt/data" ]
ENTRYPOINT [ "/usr/bin/tini", "-g", "--", "/opt/hermes/docker/entrypoint.sh" ]
102
docker/hermes/entrypoint.sh
Executable file
@@ -0,0 +1,102 @@
#!/bin/bash
# Docker/Podman entrypoint: bootstrap config files into the mounted volume, then run hermes.
set -e

HERMES_HOME="${HERMES_HOME:-/opt/data}"
INSTALL_DIR="/opt/hermes"

# --- Privilege dropping via gosu ---
# When started as root (the default for Docker, or fakeroot in rootless Podman),
# optionally remap the hermes user/group to match host-side ownership, fix volume
# permissions, then re-exec as hermes.
if [ "$(id -u)" = "0" ]; then
    if [ -n "$HERMES_UID" ] && [ "$HERMES_UID" != "$(id -u hermes)" ]; then
        echo "Changing hermes UID to $HERMES_UID"
        usermod -u "$HERMES_UID" hermes
    fi

    if [ -n "$HERMES_GID" ] && [ "$HERMES_GID" != "$(id -g hermes)" ]; then
        echo "Changing hermes GID to $HERMES_GID"
        # -o allows non-unique GID (e.g. macOS GID 20 "staff" may already exist
        # as "dialout" in the Debian-based container image)
        groupmod -o -g "$HERMES_GID" hermes 2>/dev/null || true
    fi

    # Fix ownership of the data volume. When HERMES_UID remaps the hermes user,
    # files created by previous runs (under the old UID) become inaccessible.
    # Always chown -R when UID was remapped; otherwise only if top-level is wrong.
    actual_hermes_uid=$(id -u hermes)
    needs_chown=false
    if [ -n "$HERMES_UID" ] && [ "$HERMES_UID" != "10000" ]; then
        needs_chown=true
    elif [ "$(stat -c %u "$HERMES_HOME" 2>/dev/null)" != "$actual_hermes_uid" ]; then
        needs_chown=true
    fi
    if [ "$needs_chown" = true ]; then
        echo "Fixing ownership of $HERMES_HOME to hermes ($actual_hermes_uid)"
        # In rootless Podman the container's "root" is mapped to an unprivileged
        # host UID — chown will fail. That's fine: the volume is already owned
        # by the mapped user on the host side.
        chown -R hermes:hermes "$HERMES_HOME" 2>/dev/null || \
            echo "Warning: chown failed (rootless container?) — continuing anyway"
    fi

    echo "Dropping root privileges"
    exec gosu hermes "$0" "$@"
fi

# --- Running as hermes from here ---
source "${INSTALL_DIR}/.venv/bin/activate"

# Create essential directory structure. Cache and platform directories
# (cache/images, cache/audio, platforms/whatsapp, etc.) are created on
# demand by the application — don't pre-create them here so new installs
# get the consolidated layout from get_hermes_dir().
# The "home/" subdirectory is a per-profile HOME for subprocesses (git,
# ssh, gh, npm …). Without it those tools write to /root which is
# ephemeral and shared across profiles. See issue #4426.
mkdir -p "$HERMES_HOME"/{cron,sessions,logs,hooks,memories,skills,skins,plans,workspace,home}

# .env
if [ ! -f "$HERMES_HOME/.env" ]; then
    cp "$INSTALL_DIR/.env.example" "$HERMES_HOME/.env"
fi

# config.yaml
if [ ! -f "$HERMES_HOME/config.yaml" ]; then
    cp "$INSTALL_DIR/cli-config.yaml.example" "$HERMES_HOME/config.yaml"
fi

# Ensure the main config file remains accessible to the hermes runtime user
# even if it was edited on the host after initial ownership setup.
if [ -f "$HERMES_HOME/config.yaml" ]; then
    chown hermes:hermes "$HERMES_HOME/config.yaml"
    chmod 640 "$HERMES_HOME/config.yaml"
fi

# SOUL.md
if [ ! -f "$HERMES_HOME/SOUL.md" ]; then
    cp "$INSTALL_DIR/docker/SOUL.md" "$HERMES_HOME/SOUL.md"
fi

# Sync bundled skills (manifest-based so user edits are preserved)
if [ -d "$INSTALL_DIR/skills" ]; then
    python3 "$INSTALL_DIR/tools/skills_sync.py"
fi

# Final exec: two supported invocation patterns.
#
#   docker run <image>                  -> exec `hermes` with no args (legacy default)
#   docker run <image> chat -q "..."    -> exec `hermes chat -q "..."` (legacy wrap)
#   docker run <image> sleep infinity   -> exec `sleep infinity` directly
#   docker run <image> bash             -> exec `bash` directly
#
# If the first positional arg resolves to an executable on PATH, we assume the
# caller wants to run it directly (needed by the launcher which runs long-lived
# `sleep infinity` sandbox containers — see tools/environments/docker.py).
# Otherwise we treat the args as a hermes subcommand and wrap with `hermes`,
# preserving the documented `docker run <image> <subcommand>` behavior.
if [ $# -gt 0 ] && command -v "$1" >/dev/null 2>&1; then
    exec "$@"
fi
exec hermes "$@"
163
flake.lock
generated
@@ -10,11 +10,11 @@
         "systems": "systems"
       },
       "locked": {
-        "lastModified": 1754433428,
-        "narHash": "sha256-NA/FT2hVhKDftbHSwVnoRTFhes62+7dxZbxj5Gxvghs=",
+        "lastModified": 1770165109,
+        "narHash": "sha256-9VnK6Oqai65puVJ4WYtCTvlJeXxMzAp/69HhQuTdl/I=",
         "owner": "ryantm",
         "repo": "agenix",
-        "rev": "9edb1787864c4f59ae5074ad498b6272b3ec308d",
+        "rev": "b027ee29d959fda4b60b57566d64c98a202e0feb",
         "type": "github"
       },
       "original": {
@@ -23,6 +23,20 @@
         "type": "github"
       }
     },
+    "flake-compat": {
+      "flake": false,
+      "locked": {
+        "lastModified": 1751685974,
+        "narHash": "sha256-NKw96t+BgHIYzHUjkTK95FqYRVKB8DHpVhefWSz/kTw=",
+        "rev": "549f2762aebeff29a2e5ece7a7dc0f955281a1d1",
+        "type": "tarball",
+        "url": "https://git.lix.systems/api/v1/repos/lix-project/flake-compat/archive/549f2762aebeff29a2e5ece7a7dc0f955281a1d1.tar.gz"
+      },
+      "original": {
+        "type": "tarball",
+        "url": "https://git.lix.systems/lix-project/flake-compat/archive/main.tar.gz"
+      }
+    },
     "home-manager": {
       "inputs": {
         "nixpkgs": [
@@ -44,13 +58,131 @@
         "type": "github"
       }
     },
+    "lix": {
+      "inputs": {
+        "flake-compat": "flake-compat",
+        "nix2container": "nix2container",
+        "nix_2_18": "nix_2_18",
+        "nixpkgs": [
+          "nixpkgs"
+        ],
+        "nixpkgs-regression": "nixpkgs-regression",
+        "pre-commit-hooks": "pre-commit-hooks"
+      },
+      "locked": {
+        "lastModified": 1774721317,
+        "narHash": "sha256-KS0ElyhZKdUFcfaxfwid3yi2Id3EP9i+dGL16/wx1T8=",
+        "ref": "main",
+        "rev": "d0190cff6f2314cc1c727ff113aea20e086f4bcc",
+        "revCount": 19103,
+        "type": "git",
+        "url": "https://git.lix.systems/lix-project/lix"
+      },
+      "original": {
+        "ref": "main",
+        "type": "git",
+        "url": "https://git.lix.systems/lix-project/lix"
+      }
+    },
+    "lowdown-src": {
+      "flake": false,
+      "locked": {
+        "lastModified": 1633514407,
+        "narHash": "sha256-Dw32tiMjdK9t3ETl5fzGrutQTzh2rufgZV4A/BbxuD4=",
+        "owner": "kristapsdz",
+        "repo": "lowdown",
+        "rev": "d2c2b44ff6c27b936ec27358a2653caaef8f73b8",
+        "type": "github"
+      },
+      "original": {
+        "owner": "kristapsdz",
+        "repo": "lowdown",
+        "type": "github"
+      }
+    },
+    "nix2container": {
+      "flake": false,
+      "locked": {
+        "lastModified": 1767195068,
+        "narHash": "sha256-+OMnL79ZjqM/PCz2hoQ12MnXNoSSfBGnsYBOZnA9XbI=",
+        "owner": "nlewo",
+        "repo": "nix2container",
+        "rev": "bb6801be998ba857a62c002cb77ece66b0a57298",
+        "type": "github"
+      },
+      "original": {
+        "owner": "nlewo",
+        "repo": "nix2container",
+        "type": "github"
+      }
+    },
+    "nix_2_18": {
+      "inputs": {
+        "flake-compat": [
+          "lix",
+          "flake-compat"
+        ],
+        "lowdown-src": "lowdown-src",
+        "nixpkgs": "nixpkgs",
+        "nixpkgs-regression": [
+          "lix",
+          "nixpkgs-regression"
+        ]
+      },
+      "locked": {
+        "lastModified": 1730375271,
+        "narHash": "sha256-RrOFlDGmRXcVRV2p2HqHGqvzGNyWoD0Dado/BNlJ1SI=",
+        "owner": "NixOS",
+        "repo": "nix",
+        "rev": "0f665ff6779454f2117dcc32e44380cda7f45523",
+        "type": "github"
+      },
+      "original": {
+        "owner": "NixOS",
+        "ref": "2.18.9",
+        "repo": "nix",
+        "type": "github"
+      }
+    },
     "nixpkgs": {
       "locked": {
-        "lastModified": 1755615617,
-        "narHash": "sha256-HMwfAJBdrr8wXAkbGhtcby1zGFvs+StOp19xNsbqdOg=",
+        "lastModified": 1705033721,
+        "narHash": "sha256-K5eJHmL1/kev6WuqyqqbS1cdNnSidIZ3jeqJ7GbrYnQ=",
+        "owner": "NixOS",
+        "repo": "nixpkgs",
+        "rev": "a1982c92d8980a0114372973cbdfe0a307f1bdea",
+        "type": "github"
+      },
+      "original": {
+        "owner": "NixOS",
+        "ref": "nixos-23.05-small",
+        "repo": "nixpkgs",
+        "type": "github"
+      }
+    },
+    "nixpkgs-regression": {
+      "locked": {
+        "lastModified": 1643052045,
+        "narHash": "sha256-uGJ0VXIhWKGXxkeNnq4TvV3CIOkUJ3PAoLZ3HMzNVMw=",
+        "owner": "NixOS",
+        "repo": "nixpkgs",
+        "rev": "215d4d0fd80ca5163643b03a33fde804a29cc1e2",
+        "type": "github"
+      },
+      "original": {
+        "owner": "NixOS",
+        "repo": "nixpkgs",
+        "rev": "215d4d0fd80ca5163643b03a33fde804a29cc1e2",
+        "type": "github"
+      }
+    },
+    "nixpkgs_2": {
+      "locked": {
+        "lastModified": 1774386573,
+        "narHash": "sha256-4hAV26quOxdC6iyG7kYaZcM3VOskcPUrdCQd/nx8obc=",
         "owner": "nixos",
         "repo": "nixpkgs",
-        "rev": "20075955deac2583bb12f07151c2df830ef346b4",
+        "rev": "46db2e09e1d3f113a13c0d7b81e2f221c63b8ce9",
         "type": "github"
       },
       "original": {
@@ -60,10 +192,27 @@
         "type": "github"
       }
     },
+    "pre-commit-hooks": {
+      "flake": false,
+      "locked": {
+        "lastModified": 1769939035,
+        "narHash": "sha256-Fok2AmefgVA0+eprw2NDwqKkPGEI5wvR+twiZagBvrg=",
+        "owner": "cachix",
+        "repo": "git-hooks.nix",
+        "rev": "a8ca480175326551d6c4121498316261cbb5b260",
+        "type": "github"
+      },
+      "original": {
+        "owner": "cachix",
+        "repo": "git-hooks.nix",
+        "type": "github"
+      }
+    },
     "root": {
       "inputs": {
         "agenix": "agenix",
-        "nixpkgs": "nixpkgs"
+        "lix": "lix",
+        "nixpkgs": "nixpkgs_2"
       }
     },
     "systems": {
38
flake.nix
@@ -8,10 +8,14 @@
       inputs.darwin.follows = "";
       inputs.nixpkgs.follows = "nixpkgs";
     };
+    lix = {
+      url = "git+https://git.lix.systems/lix-project/lix?ref=main";
+      inputs.nixpkgs.follows = "nixpkgs";
+    };
     self.submodules = true;
   };
 
-  outputs = { self, nixpkgs, agenix, ... }@inputs:
+  outputs = { self, nixpkgs, agenix, lix, ... }@inputs:
     let
       system = "x86_64-linux";
       keys = import ./lib/keys.nix;
@@ -26,6 +30,9 @@
       pkgs = import nixpkgs {
         inherit system overlays;
         config.allowUnfree = true;
+        config.permittedInsecurePackages = [
+          "openclaw-2026.3.12"
+        ];
       };
 
       devShell = import ./shells/nix_dev.nix {
@@ -35,9 +42,17 @@
     {
       nixosConfigurations = {
         lazyworkhorse = nixpkgs.lib.nixosSystem {
-          specialArgs = { inherit system self keys paths; };
+          specialArgs = { inherit system self keys paths inputs; };
           modules = [
-            { nixpkgs.overlays = overlays; }
+            {
+              nixpkgs.overlays = overlays;
+              nixpkgs.config.allowUnfree = true;
+              nixpkgs.config.rocmSupport = true;
+              nixpkgs.config.permittedInsecurePackages = [
+                "openclaw-2026.3.12"
+              ];
+              nix.package = lix.packages.${system}.default;
+            }
             agenix.nixosModules.default
             ./hosts/lazyworkhorse/configuration.nix
             ./hosts/lazyworkhorse/hardware-configuration.nix
@@ -45,8 +60,23 @@
             ./modules/nixos/services/docker_manager.nix
             ./modules/nixos/services/open_code_server.nix
             ./modules/nixos/services/ollama_init_custom_models.nix
+            ./modules/nixos/services/openclaw_node.nix
             ./users/gortium.nix
-            ./users/n8n-worker.nix
+            ./users/ai-worker.nix
+          ];
+        };
+
+        cyt-pi = nixpkgs.lib.nixosSystem {
+          specialArgs = { inherit self keys paths inputs; };
+          modules = [
+            {
+              nixpkgs.overlays = overlays;
+              nixpkgs.config.allowUnfree = true;
+              nixpkgs.hostPlatform = "aarch64-linux";
+              nix.package = lix.packages."aarch64-linux".default;
+            }
+            ./hosts/cyt-pi/configuration.nix
+            ./hosts/cyt-pi/hardware-configuration.nix
           ];
         };
       };
98
hosts/cyt-pi/configuration.nix
Normal file
@@ -0,0 +1,98 @@
|
|||||||
|
{ config, lib, pkgs, paths, self, ... }:
|
||||||
|
|
||||||
|
{
|
||||||
|
# Basic Host Info
|
||||||
|
networking.hostName = "cyt-pi";
|
||||||
|
time.timeZone = "America/Montreal";
|
||||||
|
i18n.defaultLocale = "en_CA.UTF-8";
|
||||||
|
|
||||||
|
# System State
|
||||||
|
system.stateVersion = "25.05";
|
||||||
|
|
||||||
|
# Boot & Hardware (Pi Zero 2 W is ARM64)
|
||||||
|
boot.loader.grub.enable = false;
|
||||||
|
boot.loader.generic-extlinux-compatible.enable = true;
|
||||||
|
boot.kernelPackages = pkgs.linuxPackages_latest;
|
||||||
|
|
||||||
|
# Networking
|
||||||
|
networking.networkmanager.enable = true;
|
||||||
|
services.openssh = {
|
||||||
|
enable = true;
|
||||||
|
settings.PermitRootLogin = "prohibit-password";
|
||||||
|
};
|
||||||
|
|
||||||
|
# User
|
||||||
|
users.users.gortium = {
|
||||||
|
isNormalUser = true;
|
||||||
|
extraGroups = [ "wheel" "networkmanager" "kismet" ];
|
||||||
|
openssh.authorizedKeys.keys = [
|
||||||
|
# Populate with your public key
|
||||||
|
];
|
||||||
|
};
|
||||||
|
|
||||||
|
# CYT Project Dependencies (Headless)
|
||||||
|
environment.systemPackages = with pkgs; [
|
||||||
|
git
|
||||||
|
python311
|
||||||
|
python311Packages.opencv4
|
||||||
|
python311Packages.numpy
|
||||||
|
python311Packages.pillow
|
||||||
|
autossh # For the reverse tunnel
|
||||||
|
kismet # Wi-Fi monitoring
|
||||||
|
];
|
||||||
|
|
||||||
|
# Kismet Service
|
||||||
|
systemd.services.kismet = {
|
||||||
|
description = "Kismet Wi-Fi Monitor";
|
||||||
|
after = [ "network-online.target" ];
|
||||||
|
wantedBy = [ "multi-user.target" ];
|
||||||
|
serviceConfig = {
|
||||||
|
User = "gortium";
|
||||||
|
Group = "kismet";
|
||||||
|
ExecStart = ''
|
||||||
|
${pkgs.kismet}/bin/kismet -c panda --log-base=/home/gortium/kismet_logs --no-nc-ui
|
||||||
|
'';
|
||||||
|
Restart = "always";
|
||||||
|
RestartSec = "10s";
|
||||||
|
};
|
||||||
|
};
|
||||||
|
|
||||||
|
# Reverse SSH Tunnel Service
|
||||||
|
systemd.services.cyt-tunnel = {
|
||||||
|
description = "Reverse SSH Tunnel to lazyworkhorse.net";
|
||||||
|
after = [ "network-online.target" ];
|
||||||
|
wantedBy = [ "multi-user.target" ];
|
||||||
|
serviceConfig = {
|
||||||
|
User = "gortium";
|
||||||
|
ExecStart = ''
|
||||||
|
${pkgs.autossh}/bin/autossh -M 0 -N \
|
||||||
|
-o "ServerAliveInterval 30" \
|
||||||
|
-o "ServerAliveCountMax 3" \
|
||||||
|
-R 19999:localhost:22 \
|
||||||
|
gortium@lazyworkhorse.net -p 2425 \
|
||||||
|
-i /home/gortium/.ssh/cyt_tunnel_key
|
||||||
|
'';
|
||||||
|
Restart = "always";
|
||||||
|
RestartSec = "10s";
|
||||||
|
};
|
||||||
|
};
|
||||||
|
|
||||||
|
# CYT Application Service
|
||||||
|
systemd.services.cyt-app = {
|
||||||
|
description = "Chasing Your Tail - Target Detector";
|
||||||
|
after = [ "network-online.target" "kismet.service" ];
|
||||||
|
wantedBy = [ "multi-user.target" ];
|
||||||
|
serviceConfig = {
|
||||||
|
User = "gortium";
|
||||||
|
WorkingDirectory = "/home/gortium/Chasing-Your-Tail-NG";
|
||||||
|
ExecStart = ''
|
||||||
|
${pkgs.python311.withPackages (ps: with ps; [ opencv4 numpy pillow ])}/bin/python3 target_detector_cli.py --min-ssids 2
|
||||||
|
'';
|
||||||
|
Restart = "on-failure";
|
||||||
|
RestartSec = "60s";
|
||||||
|
Environment = [
|
||||||
|
"CYT_KISMET_LOGS=/home/gortium/kismet_logs"
|
||||||
|
];
|
||||||
|
};
|
||||||
|
};
|
||||||
|
}
|
||||||
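Review note: the `kismet`, `cyt-tunnel`, and `cyt-app` units above list `network-online.target` only under `after`. In systemd, `After=` orders units that are already being started; it does not pull the target in. The `opencode-gsd-install` hunk later in this comparison adds a matching `wants` entry for the same reason. A minimal sketch of the pairing follows, assuming a hypothetical unit name `example-tunnel` and a placeholder command that are not part of this repository:

```nix
{ pkgs, ... }:
{
  # Hypothetical unit, for illustration only.
  systemd.services.example-tunnel = {
    description = "Example unit that needs the network to be up";

    # Ordering only: start after the target, if the target is being started.
    after = [ "network-online.target" ];
    # Dependency: pull the target into the same transaction.
    wants = [ "network-online.target" ];

    wantedBy = [ "multi-user.target" ];

    serviceConfig = {
      # Placeholder command; a real unit would run autossh, kismet, etc.
      ExecStart = "${pkgs.coreutils}/bin/sleep infinity";
      Restart = "on-failure";
    };
  };
}
```

With both lines present the unit waits for the network-online target instead of merely being sorted relative to it.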
24
hosts/cyt-pi/hardware-configuration.nix
Normal file
@@ -0,0 +1,24 @@
|
|||||||
|
{ config, lib, pkgs, modulesPath, ... }:
|
||||||
|
|
||||||
|
{
|
||||||
|
imports =
|
||||||
|
[ (modulesPath + "/installer/scan/not-detected.nix")
|
||||||
|
];
|
||||||
|
|
||||||
|
boot.initrd.availableKernelModules = [ "xhci_pci" "usbhid" "sdhci_pci" ];
|
||||||
|
boot.initrd.kernelModules = [ ];
|
||||||
|
boot.kernelModules = [ ];
|
||||||
|
boot.extraModulePackages = [ ];
|
||||||
|
|
||||||
|
# Pi Zero 2 W specific filesystem
|
||||||
|
fileSystems."/" =
|
||||||
|
{ device = "/dev/disk/by-label/NIXOS_SD";
|
||||||
|
fsType = "ext4";
|
||||||
|
options = [ "noatime" ];
|
||||||
|
};
|
||||||
|
|
||||||
|
swapDevices = [ ];
|
||||||
|
|
||||||
|
nixpkgs.hostPlatform = lib.mkDefault "aarch64-linux";
|
||||||
|
hardware.enableRedistributableFirmware = true;
|
||||||
|
}
|
||||||
@@ -9,7 +9,7 @@
|
|||||||
hoardingcow-mount.enable = true;
|
hoardingcow-mount.enable = true;
|
||||||
|
|
||||||
# Flakes
|
# Flakes
|
||||||
nix.settings.experimental-features = [ "nix-command" "flakes" ];
|
nix.settings.experimental-features = [ "nix-command" "flakes" "flake-self-attrs" ];
|
||||||
nix.settings.trusted-users = [ "root" "gortium" ];
|
nix.settings.trusted-users = [ "root" "gortium" ];
|
||||||
|
|
||||||
# Garbage collection
|
# Garbage collection
|
||||||
@@ -125,14 +125,20 @@
|
|||||||
age
|
age
|
||||||
agenix
|
agenix
|
||||||
git
|
git
|
||||||
|
nh
|
||||||
lm_sensors
|
lm_sensors
|
||||||
rocmPackages.rocminfo
|
rocmPackages.rocminfo
|
||||||
rocmPackages.rocm-smi
|
rocmPackages.rocm-smi
|
||||||
|
nvtopPackages.amd
|
||||||
clinfo
|
clinfo
|
||||||
ncurses
|
ncurses
|
||||||
kitty.terminfo
|
kitty.terminfo
|
||||||
nodejs_22
|
nodejs_22
|
||||||
uv
|
uv
|
||||||
|
openclaw
|
||||||
|
(python3.withPackages (ps: with ps; [
|
||||||
|
openai-whisper
|
||||||
|
]))
|
||||||
];
|
];
|
||||||
|
|
||||||
# Some programs need SUID wrappers, can be configured further or are
|
# Some programs need SUID wrappers, can be configured further or are
|
||||||
@@ -148,7 +154,7 @@
|
|||||||
# Enable the OpenSSH daemon
|
# Enable the OpenSSH daemon
|
||||||
services.openssh = {
|
services.openssh = {
|
||||||
enable = true;
|
enable = true;
|
||||||
ports = [ 22 2424 ];
|
ports = [ 2424 ];
|
||||||
settings = {
|
settings = {
|
||||||
PasswordAuthentication = false;
|
PasswordAuthentication = false;
|
||||||
KbdInteractiveAuthentication = false;
|
KbdInteractiveAuthentication = false;
|
||||||
@@ -162,18 +168,6 @@
|
|||||||
];
|
];
|
||||||
};
|
};
|
||||||
|
|
||||||
# services.ollama = {
|
|
||||||
# enable = true;
|
|
||||||
# acceleration = "rocm";
|
|
||||||
# # Optional: force Ollama to use the MI50 target
|
|
||||||
# rocmOverrideGfx = "9.0.6";
|
|
||||||
# environmentVariables = {
|
|
||||||
# ROCR_VISIBLE_DEVICES = "0,1";
|
|
||||||
# # This helps with memory allocation on dual-GPU setups
|
|
||||||
# HSA_ENABLE_SDMA = "0";
|
|
||||||
# };
|
|
||||||
# };
|
|
||||||
|
|
||||||
services.dockerStacks = {
|
services.dockerStacks = {
|
||||||
versioncontrol = {
|
versioncontrol = {
|
||||||
path = self + "/assets/compose/versioncontrol";
|
path = self + "/assets/compose/versioncontrol";
|
||||||
@@ -204,6 +198,32 @@
|
|||||||
path = self + "/assets/compose/homeautomation";
|
path = self + "/assets/compose/homeautomation";
|
||||||
envFile = config.age.secrets.containers_env.path;
|
envFile = config.age.secrets.containers_env.path;
|
||||||
};
|
};
|
||||||
|
|
||||||
|
authentification = {
|
||||||
|
path = self + "/assets/compose/authentification";
|
||||||
|
};
|
||||||
|
|
||||||
|
backup = {
|
||||||
|
path = self + "/assets/compose/backup";
|
||||||
|
envFile = config.age.secrets.containers_env.path;
|
||||||
|
};
|
||||||
|
|
||||||
|
coms = {
|
||||||
|
path = self + "/assets/compose/coms";
|
||||||
|
envFile = config.age.secrets.containers_env.path;
|
||||||
|
};
|
||||||
|
|
||||||
|
finance = {
|
||||||
|
path = self + "/assets/compose/finance";
|
||||||
|
};
|
||||||
|
|
||||||
|
homepage = {
|
||||||
|
path = self + "/assets/compose/homepage";
|
||||||
|
};
|
||||||
|
|
||||||
|
# tak = {
|
||||||
|
# path = self + "/assets/compose/tak";
|
||||||
|
# };
|
||||||
};
|
};
|
||||||
|
|
||||||
services.opencode = {
|
services.opencode = {
|
||||||
@@ -212,27 +232,6 @@
|
|||||||
ollamaUrl = "http://127.0.0.1:11434/v1";
|
ollamaUrl = "http://127.0.0.1:11434/v1";
|
||||||
};
|
};
|
||||||
|
|
||||||
# services.systemd-fancon = {
|
|
||||||
# enable = true;
|
|
||||||
# config = ''
|
|
||||||
# [MI50_Cooling]
|
|
||||||
# # The lm96163 controller
|
|
||||||
# hwmon = hwmon0
|
|
||||||
|
|
||||||
# # Most lm96163 chips use pwm1 for the main fan header
|
|
||||||
# pwm = 1
|
|
||||||
# pwm = 2
|
|
||||||
|
|
||||||
# # Watch both MI50 cards
|
|
||||||
# sensor = hwmon3/temp1_input
|
|
||||||
# sensor = hwmon4/temp1_input
|
|
||||||
|
|
||||||
# Server cards need air early!
|
|
||||||
# # Starts spinning at 40C, full blast by 70C
|
|
||||||
# curve = 40:60 55:160 70:255
|
|
||||||
# '';
|
|
||||||
# };
|
|
||||||
|
|
||||||
# Private host ssh key managed by agenix
|
# Private host ssh key managed by agenix
|
||||||
age = {
|
age = {
|
||||||
identityPaths = paths.identities;
|
identityPaths = paths.identities;
|
||||||
@@ -251,16 +250,33 @@
|
|||||||
mode = "0600";
|
mode = "0600";
|
||||||
path = "/etc/ssh/ssh_host_ed25519_key";
|
path = "/etc/ssh/ssh_host_ed25519_key";
|
||||||
};
|
};
|
||||||
n8n_ssh_key = {
|
ai_ssh_key = {
|
||||||
file = ../../secrets/n8n_ssh_key.age;
|
file = ../../secrets/ai_ssh_key.age;
|
||||||
owner = "root";
|
owner = "root";
|
||||||
group = "root";
|
group = "root";
|
||||||
mode = "0600";
|
mode = "0600";
|
||||||
path = "/home/n8n-worker/.ssh/n8n_ssh_key";
|
path = "/home/ai-worker/.ssh/ai_ssh_key";
|
||||||
|
};
|
||||||
|
openclaw_gateway_token = {
|
||||||
|
file = ../../secrets/openclaw_gateway_token.age;
|
||||||
|
owner = "root";
|
||||||
|
group = "ai-worker";
|
||||||
|
mode = "0440";
|
||||||
|
path = "/run/secrets/openclaw_gateway_token";
|
||||||
};
|
};
|
||||||
};
|
};
|
||||||
};
|
};
|
||||||
|
|
||||||
|
# OpenClaw Node service (host-side execution for Docker gateway)
|
||||||
|
services.openclaw-node = {
|
||||||
|
enable = true;
|
||||||
|
user = "ai-worker";
|
||||||
|
gatewayHost = "127.0.0.1";
|
||||||
|
gatewayPort = 18789;
|
||||||
|
gatewayTokenFile = "/run/secrets/openclaw_gateway_token";
|
||||||
|
displayName = "lazyworkhorse-host";
|
||||||
|
};
|
||||||
|
|
||||||
# Public host ssh key (kept in sync with the private one)
|
# Public host ssh key (kept in sync with the private one)
|
||||||
environment.etc."ssh/ssh_host_ed25519_key.pub".text =
|
environment.etc."ssh/ssh_host_ed25519_key.pub".text =
|
||||||
"${keys.hosts.lazyworkhorse.main}";
|
"${keys.hosts.lazyworkhorse.main}";
|
||||||
@@ -276,7 +292,6 @@
|
|||||||
enable32Bit = true; # Useful for some compatibility layers
|
enable32Bit = true; # Useful for some compatibility layers
|
||||||
extraPackages = with pkgs; [
|
extraPackages = with pkgs; [
|
||||||
rocmPackages.clr.icd # OpenCL/HIP runtime
|
rocmPackages.clr.icd # OpenCL/HIP runtime
|
||||||
amdvlk # Vulkan drivers
|
|
||||||
];
|
];
|
||||||
};
|
};
|
||||||
nixpkgs.config.rocmTargets = [ "gfx906" ];
|
nixpkgs.config.rocmTargets = [ "gfx906" ];
|
||||||
|
|||||||
@@ -6,7 +6,7 @@
|
|||||||
gitea = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIN9tKezYidZglWBRI9/2I/cBGUUHj2dHY8rHXppYmf7F";
|
gitea = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIN9tKezYidZglWBRI9/2I/cBGUUHj2dHY8rHXppYmf7F";
|
||||||
};
|
};
|
||||||
|
|
||||||
n8n-worker = {
|
ai-worker = {
|
||||||
main = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAXeGtPPcsP2IYRQNvII41NVWhJsarEk8c4qxs/a5sXf";
|
main = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIAXeGtPPcsP2IYRQNvII41NVWhJsarEk8c4qxs/a5sXf";
|
||||||
};
|
};
|
||||||
};
|
};
|
||||||
|
|||||||
@@ -1,7 +0,0 @@
|
|||||||
{ pkgs, lib, config, ... }: {
|
|
||||||
imports =
|
|
||||||
[
|
|
||||||
# ./home
|
|
||||||
./nixos
|
|
||||||
];
|
|
||||||
}
|
|
||||||
@@ -1,6 +0,0 @@
|
|||||||
{ pkgs, lib, config, ... }: {
|
|
||||||
imports =
|
|
||||||
[
|
|
||||||
./graphical-desktop.nix
|
|
||||||
];
|
|
||||||
}
|
|
||||||
@@ -1,9 +0,0 @@
|
|||||||
{ pkgs, lib, config, ... }: {
|
|
||||||
imports =
|
|
||||||
[
|
|
||||||
./bundles
|
|
||||||
# ./programs
|
|
||||||
./services
|
|
||||||
./filesystem
|
|
||||||
];
|
|
||||||
}
|
|
||||||
@@ -1,6 +0,0 @@
|
|||||||
{ pkgs, lib, config, ... }: {
|
|
||||||
imports =
|
|
||||||
[
|
|
||||||
./hoardingcow-mount.nix
|
|
||||||
];
|
|
||||||
}
|
|
||||||
@@ -1,6 +0,0 @@
|
|||||||
{ pkgs, lib, config, ... }: {
|
|
||||||
imports =
|
|
||||||
[
|
|
||||||
./systemd
|
|
||||||
];
|
|
||||||
}
|
|
||||||
@@ -9,6 +9,12 @@ with lib;
|
|||||||
path = mkOption { type = types.str; };
|
path = mkOption { type = types.str; };
|
||||||
envFile = mkOption { type = types.nullOr types.path; default = null; };
|
envFile = mkOption { type = types.nullOr types.path; default = null; };
|
||||||
ports = mkOption { type = types.listOf types.int; default = [ ]; };
|
ports = mkOption { type = types.listOf types.int; default = [ ]; };
|
||||||
|
# New option to pass raw systemd serviceConfig
|
||||||
|
serviceConfig = mkOption {
|
||||||
|
type = types.attrs;
|
||||||
|
default = { };
|
||||||
|
description = "Extra systemd serviceConfig options for this stack.";
|
||||||
|
};
|
||||||
};
|
};
|
||||||
});
|
});
|
||||||
default = { };
|
default = { };
|
||||||
@@ -23,28 +29,29 @@ with lib;
|
|||||||
systemd.services = mapAttrs' (name: value: nameValuePair "${name}_stack" {
|
systemd.services = mapAttrs' (name: value: nameValuePair "${name}_stack" {
|
||||||
description = "Docker Compose stack: ${name}";
|
description = "Docker Compose stack: ${name}";
|
||||||
|
|
||||||
# Added 'docker.socket' to both after and wants to ensure the API is reachable
|
# Reloads the unit when compose.yml or the env file changes (restartTriggers would force a full restart instead)
|
||||||
|
reloadTriggers = [
|
||||||
|
"${builtins.hashFile "sha256" (toString value.path + "/compose.yml")}"
|
||||||
|
] ++ (lib.optional (value.envFile != null) "${value.envFile}");
|
||||||
|
|
||||||
after = [ "network.target" "docker.service" "docker.socket" "agenix.service" ];
|
after = [ "network.target" "docker.service" "docker.socket" "agenix.service" ];
|
||||||
wants = [ "docker.socket" "agenix.service" ];
|
wants = [ "docker.socket" "agenix.service" ];
|
||||||
requires = [ "docker.service" ];
|
requires = [ "docker.service" ];
|
||||||
|
|
||||||
wantedBy = [ "multi-user.target" ];
|
wantedBy = [ "multi-user.target" ];
|
||||||
|
|
||||||
serviceConfig = {
|
path = with pkgs; [ git docker docker-compose bash ];
|
||||||
|
|
||||||
|
# We merge the base config with the custom 'serviceConfig' from the submodule
|
||||||
|
serviceConfig = recursiveUpdate {
|
||||||
Type = "oneshot";
|
Type = "oneshot";
|
||||||
WorkingDirectory = value.path;
|
WorkingDirectory = value.path;
|
||||||
User = "root";
|
User = "root";
|
||||||
|
|
||||||
# This line forces the service to wait until the docker socket is actually responsive
|
|
||||||
ExecStartPre = "${pkgs.bash}/bin/bash -c 'while [ ! -S /var/run/docker.sock ]; do sleep 1; done'";
|
ExecStartPre = "${pkgs.bash}/bin/bash -c 'while [ ! -S /var/run/docker.sock ]; do sleep 1; done'";
|
||||||
|
|
||||||
ExecStart = "${pkgs.docker-compose}/bin/docker-compose up -d --remove-orphans";
|
ExecStart = "${pkgs.docker-compose}/bin/docker-compose up -d --remove-orphans";
|
||||||
ExecStop = "${pkgs.docker-compose}/bin/docker-compose down";
|
ExecStop = "${pkgs.docker-compose}/bin/docker-compose down";
|
||||||
RemainAfterExit = true;
|
RemainAfterExit = true;
|
||||||
|
|
||||||
# Ensure the environment file is passed correctly
|
|
||||||
EnvironmentFile = mkIf (value.envFile != null) [ value.envFile ];
|
EnvironmentFile = mkIf (value.envFile != null) [ value.envFile ];
|
||||||
};
|
} value.serviceConfig;
|
||||||
}) config.services.dockerStacks;
|
}) config.services.dockerStacks;
|
||||||
};
|
};
|
||||||
}
|
}
|
||||||
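Review note: the new `serviceConfig` option is merged into the base unit settings with `recursiveUpdate`, so any key a stack sets replaces the module's default and new keys are added alongside. A small standalone sketch of that merge follows, assuming `<nixpkgs>` is on the Nix path; the attribute values in `extra` are made-up examples rather than settings used by any stack in this repository:

```nix
# Evaluate with: nix-instantiate --eval --strict this-file.nix
let
  lib = (import <nixpkgs> { }).lib;

  # Mirrors a few of the module's base settings.
  base = {
    Type = "oneshot";
    User = "root";
    RemainAfterExit = true;
  };

  # Hypothetical per-stack override, as it would be written under
  # services.dockerStacks.<name>.serviceConfig.
  extra = {
    User = "ai-worker";
    TimeoutStartSec = "10min";
  };
in
# The right-hand argument wins on conflicts: User becomes "ai-worker",
# TimeoutStartSec is added, Type and RemainAfterExit are kept.
lib.recursiveUpdate base extra
```

Because the override wins, a stack can also replace `ExecStart` or `User` by accident, so per-stack values are best kept to additive keys such as timeouts.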
|
|||||||
@@ -20,11 +20,7 @@ in {
|
|||||||
|
|
||||||
environment.etc."opencode/opencode.json".text = builtins.toJSON {
|
environment.etc."opencode/opencode.json".text = builtins.toJSON {
|
||||||
"$schema" = "https://opencode.ai/config.json";
|
"$schema" = "https://opencode.ai/config.json";
|
||||||
"model" = "devstral-2-small-llama_cpp";
|
"model" = "nemotron-3-nano-llama_cpp";
|
||||||
|
|
||||||
# MCP servers for web search and enhanced functionality
|
|
||||||
# context7: Remote HTTP server for up-to-date documentation and code examples
|
|
||||||
# duckduckgo: Local MCP server for web search capabilities
|
|
||||||
"mcp" = {
|
"mcp" = {
|
||||||
"context7" = {
|
"context7" = {
|
||||||
"type" = "remote";
|
"type" = "remote";
|
||||||
@@ -46,6 +42,7 @@ in {
|
|||||||
"options" = {
|
"options" = {
|
||||||
"baseURL" = "http://localhost:8300/v1";
|
"baseURL" = "http://localhost:8300/v1";
|
||||||
"apiKey" = "not-needed";
|
"apiKey" = "not-needed";
|
||||||
|
"maxTokens" = 80000;
|
||||||
};
|
};
|
||||||
"models" = {
|
"models" = {
|
||||||
"devstral-2-small-llama_cpp" = {
|
"devstral-2-small-llama_cpp" = {
|
||||||
@@ -53,6 +50,11 @@ in {
|
|||||||
"tools" = true;
|
"tools" = true;
|
||||||
"reasoning" = false;
|
"reasoning" = false;
|
||||||
};
|
};
|
||||||
|
"nemotron-3-nano-llama_cpp" = {
|
||||||
|
"name" = "Nemotron 3 nano 30B Q8 (llama.cpp)";
|
||||||
|
"tools" = true;
|
||||||
|
"reasoning" = false;
|
||||||
|
};
|
||||||
};
|
};
|
||||||
};
|
};
|
||||||
"ollama" = {
|
"ollama" = {
|
||||||
@@ -76,6 +78,7 @@ in {
|
|||||||
systemd.services.opencode-gsd-install = {
|
systemd.services.opencode-gsd-install = {
|
||||||
description = "Install Get Shit Done OpenCode Components";
|
description = "Install Get Shit Done OpenCode Components";
|
||||||
after = [ "network-online.target" ];
|
after = [ "network-online.target" ];
|
||||||
|
wants = [ "network-online.target" ];
|
||||||
wantedBy = [ "multi-user.target" ];
|
wantedBy = [ "multi-user.target" ];
|
||||||
path = with pkgs; [
|
path = with pkgs; [
|
||||||
nodejs
|
nodejs
|
||||||
@@ -131,7 +134,6 @@ in {
|
|||||||
|
|
||||||
environment = {
|
environment = {
|
||||||
OLLAMA_BASE_URL = "http://127.0.0.1:11434";
|
OLLAMA_BASE_URL = "http://127.0.0.1:11434";
|
||||||
# Important: GSD at ~/.config/opencode, so we ensure the server sees our /etc config
|
|
||||||
OPENCODE_CONFIG = "/etc/opencode/opencode.json";
|
OPENCODE_CONFIG = "/etc/opencode/opencode.json";
|
||||||
HOME = "/home/gortium";
|
HOME = "/home/gortium";
|
||||||
NODE_PATH = "${pkgs.nodejs}/lib/node_modules";
|
NODE_PATH = "${pkgs.nodejs}/lib/node_modules";
|
||||||
|
|||||||
64
modules/nixos/services/openclaw_node.nix
Normal file
@@ -0,0 +1,64 @@
|
|||||||
|
{ config, lib, pkgs, ... }:
|
||||||
|
|
||||||
|
let
|
||||||
|
cfg = config.services.openclaw-node;
|
||||||
|
openclawPkg = pkgs.openclaw;
|
||||||
|
in {
|
||||||
|
options.services.openclaw-node = {
|
||||||
|
enable = lib.mkEnableOption "OpenClaw Node service";
|
||||||
|
|
||||||
|
user = lib.mkOption {
|
||||||
|
type = lib.types.str;
|
||||||
|
default = "ai-worker";
|
||||||
|
description = "User to run the OpenClaw headless node as.";
|
||||||
|
};
|
||||||
|
|
||||||
|
gatewayHost = lib.mkOption {
|
||||||
|
type = lib.types.str;
|
||||||
|
default = "127.0.0.1";
|
||||||
|
description = "Gateway host (IP or hostname).";
|
||||||
|
};
|
||||||
|
|
||||||
|
gatewayPort = lib.mkOption {
|
||||||
|
type = lib.types.port;
|
||||||
|
default = 18789;
|
||||||
|
description = "Gateway WebSocket port.";
|
||||||
|
};
|
||||||
|
|
||||||
|
gatewayTokenFile = lib.mkOption {
|
||||||
|
type = lib.types.str;
|
||||||
|
default = "";
|
||||||
|
description = "Path to file containing the gateway auth token.";
|
||||||
|
};
|
||||||
|
|
||||||
|
displayName = lib.mkOption {
|
||||||
|
type = lib.types.str;
|
||||||
|
default = "lazyworkhorse-host";
|
||||||
|
description = "Display name for this node (shown in pairing).";
|
||||||
|
};
|
||||||
|
};
|
||||||
|
|
||||||
|
config = lib.mkIf cfg.enable {
|
||||||
|
systemd.services.openclaw-node = {
|
||||||
|
description = "OpenClaw Headless Node Service";
|
||||||
|
after = [ "network.target" ];
|
||||||
|
wantedBy = [ "multi-user.target" ];
|
||||||
|
|
||||||
|
serviceConfig = {
|
||||||
|
Type = "exec";
|
||||||
|
User = cfg.user;
|
||||||
|
Group = cfg.user;
|
||||||
|
WorkingDirectory = "/home/${cfg.user}";
|
||||||
|
ExecStart = ''
|
||||||
|
${pkgs.bash}/bin/bash -c 'export OPENCLAW_GATEWAY_TOKEN=$(cat ${cfg.gatewayTokenFile}) && exec ${openclawPkg}/bin/openclaw node run --host ${cfg.gatewayHost} --port ${toString cfg.gatewayPort} --display-name "${cfg.displayName}"'
|
||||||
|
'';
|
||||||
|
Restart = "always";
|
||||||
|
RestartSec = 5;
|
||||||
|
};
|
||||||
|
|
||||||
|
environment = {
|
||||||
|
NODE_ENV = "production";
|
||||||
|
};
|
||||||
|
};
|
||||||
|
};
|
||||||
|
}
|
||||||
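Review note: `gatewayTokenFile` defaults to an empty string, in which case the `cat` in `ExecStart` would fail at runtime and the unit would restart in a loop. One possible guard, sketched here as an assumption rather than as part of the committed module, is a NixOS assertion that rejects the configuration at evaluation time:

```nix
{ config, lib, ... }:

let
  cfg = config.services.openclaw-node;
in {
  config = lib.mkIf cfg.enable {
    # Fail the build early instead of letting the unit crash-loop.
    assertions = [
      {
        assertion = cfg.gatewayTokenFile != "";
        message = "services.openclaw-node.gatewayTokenFile must be set when the service is enabled.";
      }
    ];
  };
}
```

The host configuration in this comparison already sets the option to `/run/secrets/openclaw_gateway_token`, so the assertion would only matter for hosts that enable the service without providing a token path.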
@@ -1,24 +1,34 @@
|
|||||||
-----BEGIN AGE ENCRYPTED FILE-----
|
-----BEGIN AGE ENCRYPTED FILE-----
|
||||||
YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IHNzaC1lZDI1NTE5IEdoTUQ4QSBGWmpW
|
YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IHNzaC1lZDI1NTE5IEdoTUQ4QSBOL29w
|
||||||
bFFuT1FNWVlsd0twcUJnYXV0T0Z3Q0RDZldsNTUwWlprQTJaK2xNCmMzS3g1OEdI
|
eGk1N2xxTHJtaUEvWWZmbkh1bk11Tjk3anNnMDB1cCtPYUMzdTNJCkdhQ08vblNG
|
||||||
bENzekRFTkIwbVRua2MzTVdnZmNKMnd6dzJjZEx5eXhBWmMKLT4gV2ktZ3JlYXNl
|
UlV1K2xVTGZVTzFWYXAzcjZaMWs0RTFWdStKSmlSTURvK1EKLT4gLC1zKU8zVkgt
|
||||||
IChaQl14QSB3IFlIcmkKVHZPSmZ2aXNaSHVUbi9UbUNTL00ycWRZbzVwTlFUUjls
|
Z3JlYXNlIFUiXFcpS302IHByVn5jOy0gRDMKQjV3SHpDWUIybGFyQUg3ZlR0R2hV
|
||||||
Z2RFSGMyM2ZDbkRlekxxemR4RTlLWnI3L0FlanpkYgpaaUlpSFdxZlo0Sk9XcXF3
|
eWM3SFlCVW5mdlpBVUF3a0xpNlZCeGNUd1oxTTlkc1RkTXdZS0lFTmN3Ci0tLSA3
|
||||||
TnZQYzY1MWxLRklycWh3MEl2ZENSMk5yMDNKNWkyZmVBNTlSNWxBSzZ2RDNmeDRP
|
VlBqM1VLWllZc0JnOTMvUFRjMU13OTdzMmhsdGJubkk5eGpERVVLYUk4Cnzh5UbU
|
||||||
CgotLS0gNEtpRlhJbkZXcGNpQzBFREhCempyYlFHcTRHSlpTOUZFeGxmNHk2c20x
|
FlgqpM8jkJ6XlsaIDCw/G3D6uJ/GRJW4gIekuhAUxpZJrc8eOA8ZuHfGrBbH3acV
|
||||||
VQrxqxWUB/GZUQixOXxdZhfeUDyzbc7DZ4CMA8o0X0NHxxonsHQXvAwcHFYVBj45
|
tVafX5F0Kr2oOblqZ6gduZOUS52KmWH8stiBJM+e5ZZ7zRQVE4PJUKUPCzi+WdcH
|
||||||
d7D9yjtHYP+EAR2skUEnlPYfUdFKtjyE4KRE/wv6VQXfjeIax0USypvuEg9e+cfA
|
zr295T//FOdicrYHdsjfziKEHzBtUCFiATW05+O2zMjYjO6cPzePcCzPWinwiID6
|
||||||
VknSLO4G+si8MvccJNZsBGGebEg8OpmSqSog6pee3jeVtr0fr5no0901rnwZYQEN
|
V+f6ngfkkQaj3wBGkzaieQJzRcdSwky21aVhGCCX/bvqx61iW2d5QAKxGbtQ2RcG
|
||||||
X63i+8cp2ZnHCxuR6ol48rUB9AEieYiYvI8gCfATigvFkjj/fEYKLK/kgqLVl96p
|
X1okr+xunAM94nzDMv46vyN97KxY7cZd4pAaOxoICc2Tfhtw6F+iS6QkQh1odJzO
|
||||||
CjtXqhO0XGROPCvyVB8yadJCw67tMdkZO39saJTeHP6r0lz37lHNm8Uwyel89kLd
|
7ZH+sSQCvndG+8z9shXGiHalASF5tdguM+JlEvAGljcaiAUtsQWxr9CoWiEkC6c6
|
||||||
CWqrIK67MH1ejXwhTfQlHSX3WQYAXfxq7fmetjcJb0NBXUBsPrAwlmz49T0TWvfa
|
NCaECSYO8Il+SXBQnSZSGJSNDhuPYCYrsjXGSAONFixuyeslAkq9x2WUaUS4H063
|
||||||
1oi60xLD+BsKR3KDgthid3GwhcrsY5RA8y8x8c4Ssk1iLKEIlyOM+f2cYJRvYMrS
|
1QvRF7XO2tBPtgCLsSjdiGp0h+ImUaGdu6fDR7zrDsGsaAFCSFeH/rGNNXRQ2vP2
|
||||||
LfSs1cvIORLA8QcADELhzV7mVsBtXo8vU5oSoCWrvT0vs2H2EFvl4Qfx/8UGoVMK
|
CSfPfDDCqpUSCn0WuA30BtaPLxGmZT6OjFevKzYMNDmdeq9ia/q8K0hmjLUBdN3k
|
||||||
p3HFMw3Qwxh2Qyr6kD6SuRc1dzbseXiBtPuN76KOQNbo9LEu0JNwsoHqv7wdUS6u
|
tdYWbwoaf4gYbUWxSleD768b0Jgxss9Vod+sFQ+NYRksdGIeyND+aQIc312XehfA
|
||||||
r831UKyTxWfl3oBUzldG2Ugka3/7wr3n2biARkADNjrvkFHo5BM6vYla583j6ml3
|
qHFBS8nlj7eUF5bdvCYQ64z741mH4cNlGxyjPBH1x8FHnEOocJXYt1l2AZSRJmJA
|
||||||
/IzQOIQXSmgv+opza1oghf2jg9UFkMOPZ9iz6srg2xaH+xZ7+xnL3cuY4ngWwIqy
|
c3z0QGXyuCbsrLBXWK1EKa/Juo4PGGsEVoLRhwJAQy9+i1JN0yrfRvSPyzvD4px6
|
||||||
pRKdcrNDOIawhEpJEAUYLHMcrCCekZPJalEcMZ26pXjVG1p9SYVsQWxkpVgOqEIH
|
wRPzlZ80MQdb2lv84WS/zcOEZmZzlLntszTRRdIfAsuaavP2Rquh4rEXABYeTZwp
|
||||||
8Q4zYMYQAQssVSED3SrQ39giW7+UfGnoqsy9qTq1UvDBpnGDMk2JYsGZmQoWEvtJ
|
5dem79s8bdW2nFsGMNz1OQKQwocyjYu1jJMHu6Gp7Ngdl1xyW7xfg0dezE1c0cIh
|
||||||
AudwoHTFj/szABXE7qootqjGGhopdC0pFWGKaSFRre7iIeiYNJDXYi1lyAtDfZFW
|
xt1aLER9YJp4n5to5cOH16l3mjDHnAvABx38xE9loNL3399J/evw7LxpTYQ4v2Xv
|
||||||
iv8avbywunozAigA8+wuF4Zw1GOThPAOLNU=
|
x8xnDHcqJ+deFSwyuUnMS5DkUeYuHmUl0Q2WYcfY+ibCmcgCb2ObTtuN1/ZxNYrL
|
||||||
|
OKrnmfuSvBgyuIOj5e6uWW0+Zs8dHKXu2TgV8WignxOhl5zQgCpCBlqVfO0t+NCu
|
||||||
|
Gi26hU/fhGWQ/1oQa3VkpGsypZbJpgQvfWxfcGHP/MMhnl01zzlP8/aexSY3pAxf
|
||||||
|
fz9v0IVh6xxtu3zbiiVzUsXbfG7t+xY98jMphf4AS2mWva3GWVmhhu0lS3J3P+go
|
||||||
|
YEEP4rOFHeU0Y1/6kLydTXvz4jMH0H92XQIzshd7vzQnEJPUPAzqRmw3LKYGgCI+
|
||||||
|
wZEnxJ6ckqTkGBFnxTpy9LLllwmnz2Ky87nY3XAmqxlhb2Ap1XFAlfgszmGjc+Il
|
||||||
|
KkIgoWQHTUm6QM9ta++oUTIDneOvxGd0zZsqoEhiC/7E01BNNZ6E58TeJU3fDlA3
|
||||||
|
mX6n05XjwPRpgXZfayPoAgBlZc2H4KeiynxwNZ/dWu7qz7L6Ppk6Nvtly8giTbFx
|
||||||
|
CA+tto7vq+D+CAEJ4bgyq4BCH4GL4APrhPcWp98Mko1WCiRTIKgkZxQCYvlg/LZq
|
||||||
|
LNhMacP9T1qTvNC+yR1NEMiegE3APzk6CkDpVaO9+5f/sqifNPINCMothenI9ePw
|
||||||
|
zjQLI3Mo1m73bkomytUZ7i1VstP5sEZ5LF72Sq7BpR3oQ3Gp0CAN9w==
|
||||||
-----END AGE ENCRYPTED FILE-----
|
-----END AGE ENCRYPTED FILE-----
|
||||||
|
|||||||
11
secrets/openclaw_gateway_token.age
Normal file
@@ -0,0 +1,11 @@
|
|||||||
|
-----BEGIN AGE ENCRYPTED FILE-----
|
||||||
|
YWdlLWVuY3J5cHRpb24ub3JnL3YxCi0+IHNzaC1lZDI1NTE5IEdoTUQ4QSBCWEpO
|
||||||
|
cG9yNnFpcHFqTkNzTngxU1MxN0NYK0hrZFhUTjVORWFrK3JNd2tZCmtMTGpwQk1E
|
||||||
|
WlUwL3N6SGRWblpnNEkrWkkyU2hQMkRIK0M3R0pOVEREV3MKLT4gY2osLWdyZWFz
|
||||||
|
ZSBacSozVVQgUCAxRS1OQSAuKXxDPCoKbStWNW1BZjBZQzNDaTlDbU5EZkxsRWxM
|
||||||
|
cXJ3dDU1RDNpOXRlV0tzdEp2NUo3S1lhRG5Md0RHTGlJdkFSYmt5YQo4R1hiQWRG
|
||||||
|
V2VxekJKZwotLS0geG1XSi9VbkhXZHQzcEFVS3hKNzVueXFLa2xnZTc3Q2tJTVZ5
|
||||||
|
eXJabWk5Ywp6bJCP3s0xxzjE+eTR+cv7ZUnkoliT/n7uIprq1BTn/LIRLkUTUqs3
|
||||||
|
NiDwrXcoq4/QKd0Dt+8ap3vFAuusjGxRlnYMaRrZie2AGtTV8U7Q7durm9o2K+/4
|
||||||
|
QzRQ/MtumIQm
|
||||||
|
-----END AGE ENCRYPTED FILE-----
|
||||||
@@ -10,4 +10,5 @@ in
|
|||||||
"containers.env.age".publicKeys = authorizedKeys;
|
"containers.env.age".publicKeys = authorizedKeys;
|
||||||
"lazyworkhorse_host_ssh_key.age".publicKeys = authorizedKeys;
|
"lazyworkhorse_host_ssh_key.age".publicKeys = authorizedKeys;
|
||||||
"n8n_ssh_key.age".publicKeys = authorizedKeys;
|
"n8n_ssh_key.age".publicKeys = authorizedKeys;
|
||||||
|
"openclaw_gateway_token.age".publicKeys = authorizedKeys;
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1,12 +1,14 @@
|
|||||||
{ pkgs, inputs, config, keys, ... }: {
|
{ pkgs, inputs, config, keys, ... }: {
|
||||||
users.users.n8n-worker = {
|
users.users.ai-worker = {
|
||||||
isSystemUser = true;
|
isSystemUser = true;
|
||||||
group = "n8n-worker";
|
group = "ai-worker";
|
||||||
|
home = "/home/ai-worker";
|
||||||
|
createHome = true;
|
||||||
extraGroups = [ "docker" ];
|
extraGroups = [ "docker" ];
|
||||||
shell = pkgs.bashInteractive;
|
shell = pkgs.bashInteractive;
|
||||||
openssh.authorizedKeys.keys = [
|
openssh.authorizedKeys.keys = [
|
||||||
keys.users.n8n-worker.main
|
keys.users.ai-worker.main
|
||||||
];
|
];
|
||||||
};
|
};
|
||||||
users.groups.n8n-worker = {};
|
users.groups.ai-worker = {};
|
||||||
}
|
}
|
||||||