	---
	language: en
	license: cc-by-nc-4.0
	tags:
	- automotive
	- intrusion-detection
	- can-bus
	- cybersecurity
	- temporal-cnn
	- pytorch-lightning
	- onnx
	- tensorrt
	datasets:
	- car-hacking-challenge-2021
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	library_name: pytorch
	---

	# SecIDS-v2: Next-Generation Automotive Intrusion Detection System

	![GitHub Banner](visuals/github_banner.png)

	## Model Description

	SecIDS-v2 is a production-ready deep learning system for detecting cyber attacks on automotive CAN (Controller Area Network) buses. Built on a Temporal Convolutional Network (TCN), it reaches 98.2% detection accuracy on the Car Hacking Challenge 2021 dataset with 4.2 ms inference latency on an NVIDIA Jetson Nano (FP16), which is fast enough for real-time embedded deployment.

	### Key Features

	- High Performance: 98.2% detection accuracy with 4.2ms inference latency on Jetson Nano
	- Multi-Task Learning: Simultaneous detection of multiple attack types (DoS, Fuzzy, Spoofing, Replay)
	- Production-Ready: Complete deployment pipeline with ONNX/TensorRT export, FastAPI server, and Streamlit dashboard
	- Advanced Feature Engineering: 25 CAN-specific features including temporal, payload, and statistical attributes
	- Edge-Optimized: INT8 quantization support for resource-constrained automotive ECUs

	## Architecture

	![Model Architecture](visuals/architecture.png)

	SecIDS-v2 uses a Temporal Convolutional Network (TCN) with the following structure:

	- Input: Sliding windows of 128 CAN frames × 25 features
	- TCN Backbone: 3 blocks with dilated convolutions (32→64→128 filters, dilations 1→2→4)
	- Receptive Field: 128 frames (captures long-range temporal dependencies)
	- Multi-Task Heads: 4 classification heads for different attack types
	- Parameters: 3.8M (27% smaller than LSTM v1)
	- Output: Binary classification + attack type prediction
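
The receptive field of a stack of dilated 1D convolutions follows directly from the kernel size and the dilations. This README does not state the kernel size or the number of convolutions per block, so both are parameters in the sketch below (assumptions for illustration, not the author's values):

```python
def receptive_field(kernel_size, dilations, convs_per_block=2):
    """Receptive field (in frames) of stacked dilated 1D convolutions.

    Each convolution with dilation d widens the field by (kernel_size - 1) * d.
    convs_per_block=2 is an assumption borrowed from common TCN residual blocks.
    """
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)


dilations = [1, 2, 4]  # from the architecture list above
for k in (3, 5, 7, 9, 11):  # hypothetical kernel sizes
    print(f"kernel_size={k}: {receptive_field(k, dilations)} frames")
```

Under these assumptions, dilations 1, 2, 4 cover a full 128-frame window only with a kernel size of 11 or more; smaller kernels see a shorter history per output step.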

	## Performance

	![Performance Comparison](visuals/performance_comparison.png)

	### SecIDS v1 → v2 Improvements

	| Metric | LSTM v1 | TCN v2 | Improvement |
	|--------|---------|--------|-------------|
	| Accuracy | 97.2% | 98.2% | +1.0% |
	| Inference (Jetson Nano) | 18.5 ms | 4.2 ms | 4.4× faster |
	| Model Size | 5.2M params | 3.8M params | -27% |
	| F1-Score (DoS) | 96.5% | 98.1% | +1.6% |
	| F1-Score (Fuzzy) | 95.8% | 97.9% | +2.1% |
	| F1-Score (Spoofing) | 96.2% | 98.5% | +2.3% |
	| F1-Score (Replay) | 97.1% | 98.3% | +1.2% |

	### Hardware Performance

	| Device | Precision | Latency | Throughput |
	|--------|-----------|---------|------------|
	| NVIDIA Jetson Nano | FP16 | 4.2 ms | 238 FPS |
	| NVIDIA Jetson Nano | INT8 | 2.8 ms | 357 FPS |
	| NVIDIA Jetson Xavier NX | FP16 | 1.9 ms | 526 FPS |
	| Intel Core i7 (CPU) | FP32 | 12.5 ms | 80 FPS |
	| NVIDIA RTX 4060 | FP32 | 0.8 ms | 1250 FPS |
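
The throughput column matches the reciprocal of the latency column, which suggests the figures describe back-to-back inference on one window at a time. That reading (batch size 1, no pipelining) is an assumption; the check below simply reproduces the arithmetic:

```python
def throughput_fps(latency_ms):
    """Windows per second when inferences run back to back at batch size 1."""
    return 1000.0 / latency_ms


# Latencies copied from the table above
latencies_ms = {
    "Jetson Nano FP16": 4.2,
    "Jetson Nano INT8": 2.8,
    "Jetson Xavier NX FP16": 1.9,
    "Core i7 FP32": 12.5,
    "RTX 4060 FP32": 0.8,
}

for device, ms in latencies_ms.items():
    print(f"{device}: {throughput_fps(ms):.0f} FPS")
```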

	## Feature Importance

	![Feature Importance](visuals/feature_importance.png)

	Top 10 Most Important Features:

	1. Inter-Arrival Time (Δt) - Time between consecutive frames
	2. Payload Entropy - Randomness of data payload
	3. Hamming Distance - Bit-level changes between frames
	4. ID Change Frequency - Rate of CAN ID transitions
	5. DLC Variance - Data Length Code variability
	6. ID Occurrence Rate - Frequency of specific CAN IDs
	7. Payload Mean - Average payload byte values
	8. Payload Std Dev - Payload variability
	9. Time-Since-Last - Time since last occurrence of ID
	10. ID Diversity - Number of unique IDs in window
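
As a rough illustration of the top three features, the sketch below computes them with the standard library only. The function names and the zero-padding rule for payloads of different length are assumptions, not the `CANPreprocessor` implementation:

```python
import math
from collections import Counter


def payload_entropy(payload: bytes) -> float:
    """Shannon entropy (in bits) of the byte values in one payload."""
    if not payload:
        return 0.0
    n = len(payload)
    return -sum((c / n) * math.log2(c / n) for c in Counter(payload).values())


def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits; the shorter payload is zero-padded (assumption)."""
    size = max(len(a), len(b))
    a, b = a.ljust(size, b"\x00"), b.ljust(size, b"\x00")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def inter_arrival_times(timestamps):
    """Time between consecutive frames; the first frame gets 0.0."""
    if not timestamps:
        return []
    return [0.0] + [t1 - t0 for t0, t1 in zip(timestamps, timestamps[1:])]
```

A constant payload has entropy 0, while eight distinct bytes give the maximum of 3 bits for an 8-byte frame, which is why fuzzing attacks with random payloads tend to stand out on this feature.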

	## Training Data

	Primary Dataset: [Car Hacking Challenge 2021](https://ocslab.hksecurity.net/Datasets/CAN-intrusion-dataset)

	- Total Frames: ~200,000 CAN frames
	- Normal Traffic: ~180,000 frames (90%)
	- Attack Types: DoS, Fuzzy, Spoofing, Gear Replay
	- Attack Frames: ~20,000 frames (10%)
	- Train/Val Split: 70/30
	- Window Size: 128 frames with 50% overlap
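
A window of 128 frames with 50% overlap corresponds to a stride of 64 frames. A minimal sketch of that windowing, assuming trailing frames that do not fill a complete window are dropped (the README does not say how the tail is handled):

```python
def sliding_windows(frames, window=128, stride=64):
    """Yield overlapping windows; an incomplete final window is dropped."""
    for start in range(0, len(frames) - window + 1, stride):
        yield frames[start:start + window]


def num_windows(n_frames, window=128, stride=64):
    """Number of complete windows produced by sliding_windows."""
    if n_frames < window:
        return 0
    return (n_frames - window) // stride + 1


windows = list(sliding_windows(list(range(256))))
print(len(windows), windows[1][0])  # 3 windows, the second starts at frame 64
```

For a trace of exactly 200,000 frames, `num_windows(200_000)` gives 3,124 windows before the train/validation split.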

	### Data Preprocessing

	1. Feature Extraction: 25 engineered features per frame
	   - Temporal: Inter-arrival time, time-since-last, sequence position
	   - Payload: Entropy, mean, std, Hamming distance
	   - Statistical: Per-ID aggregates, DLC variance, ID diversity

	2. Normalization: StandardScaler (μ=0, σ=1)

	3. Augmentation (training only):
	   - Bit-flip injection (5% probability)
	   - Temporal jitter (±2 ms)
	   - Random masking (10% of features)
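
The three augmentations could look roughly like the sketch below. Several details are assumptions because the list above does not specify them: the 5% is read as a per-frame probability of flipping one bit, timestamps are taken to be in seconds, and masked features are set to 0 (the mean after standardization).

```python
import random


def bit_flip(payload: bytes, p: float = 0.05, rng=random) -> bytes:
    """With probability p, flip one randomly chosen bit of the payload."""
    if not payload or rng.random() >= p:
        return payload
    data = bytearray(payload)
    bit = rng.randrange(len(data) * 8)
    data[bit // 8] ^= 1 << (bit % 8)
    return bytes(data)


def jitter(timestamp_s: float, max_jitter_ms: float = 2.0, rng=random) -> float:
    """Shift a timestamp (seconds) by a uniform offset within +/- max_jitter_ms."""
    return timestamp_s + rng.uniform(-max_jitter_ms, max_jitter_ms) / 1000.0


def mask_features(features, frac: float = 0.10, rng=random):
    """Zero out a random fraction of the feature values of one frame."""
    n_masked = round(len(features) * frac)
    masked = set(rng.sample(range(len(features)), n_masked))
    return [0.0 if i in masked else x for i, x in enumerate(features)]
```

Bit flips and jitter act on raw frames, so they would run before feature extraction, while masking acts on the extracted feature vector. Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible.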

	## Intended Use

	### Primary Use Cases

	- Automotive Cybersecurity: Real-time intrusion detection in connected vehicles
	- CAN Bus Monitoring: Network anomaly detection in industrial/automotive systems
	- Security Research: Baseline model for CAN-bus attack detection research
	- Education: Reference implementation for automotive security courses

	### Out-of-Scope Use

	- Non-CAN Protocols: Not designed for FlexRay, LIN, or automotive Ethernet networks
	- Safety-Critical Control: Should not replace functional safety mechanisms (ISO 26262)
	- Guaranteed Protection: No ML model provides 100% security; defense-in-depth required

	## Limitations

	- Training Data Bias: Trained primarily on synthesized attack scenarios
	- Zero-Day Attacks: May not detect novel attack patterns not seen during training
	- Context Dependence: Performance may vary across different vehicle platforms
	- Latency vs Accuracy Trade-off: Optimized for speed; may miss subtle attacks
	- False Positives: ~1.8% false alarm rate may require tuning for production

	## Usage

	### Quick Start (Python)

	```python
	import torch
	from secids.models import TemporalCNN
	from secids.data import CANPreprocessor

	# Load model
	model = TemporalCNN.load_from_checkpoint("final_model.ckpt")
	model.eval()

	# Preprocess CAN data (can_frames: one window of 128 raw CAN frames)
	preprocessor = CANPreprocessor()
	features = preprocessor.transform(can_frames)  # [128, 25]

	# Inference on a batch containing a single window
	with torch.no_grad():
	    logits = model(features.unsqueeze(0))  # input shape: [1, 128, 25]
	    pred = torch.argmax(logits, dim=-1)

	print(f"Attack Detected: {pred.item() == 1}")
	```

	### ONNX Deployment

	```python
	import onnxruntime as ort

	# Load ONNX model
	session = ort.InferenceSession("secids_v2.onnx")

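	# Run inference (the exported model expects a float32 array with a batch dimension)
	batch = features.unsqueeze(0).numpy().astype("float32")  # [1, 128, 25]
	outputs = session.run(None, {"input": batch})
	prediction = outputs[0].argmax(axis=-1)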
	```

	### FastAPI Server

	```bash
	# Start REST API server
	cd serving
	python app.py

	# Make prediction request
	curl -X POST http://localhost:8080/predict \
	    -H "Content-Type: application/json" \
	    -d @can_sample.json
	```

	### Streamlit Dashboard

	```bash
	# Start web dashboard
	cd serving
	streamlit run dashboard.py --server.port 5060
	```

	## Training

	### Requirements

	```bash
	pip install torch torchvision pytorch-lightning
	pip install pandas numpy pyarrow
	pip install scikit-learn wandb
	```

	### Training Script

	```bash
	python scripts/train.py \
	    --model tcn \
	    --data data/processed/train.parquet \
	    --batch_size 32 \
	    --epochs 50 \
	    --gpus 1 \
	    --precision 16
	```

	### Hyperparameters

	- Optimizer: AdamW (lr=1e-3, weight_decay=1e-4)
	- Scheduler: ReduceLROnPlateau (patience=5, factor=0.5)
	- Loss: CrossEntropyLoss with class weights [1.0, 10.0]
	- Batch Size: 32
	- Window Size: 128 frames
	- Stride: 64 frames (50% overlap)
	- Early Stopping: Patience=10 epochs
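
In plain PyTorch, these settings translate to roughly the following. The tiny linear model is a hypothetical stand-in for the TCN, and monitoring the loss with `mode="min"` is an assumption; in the actual project this wiring would presumably live in the Lightning module's `configure_optimizers`.

```python
import torch
from torch import nn

# Hypothetical stand-in for the TCN; only the loss/optimizer wiring matters here.
model = nn.Sequential(nn.Flatten(), nn.Linear(128 * 25, 2))

# Class weights [normal, attack]: errors on attack windows cost 10x more,
# which offsets the roughly 90/10 class imbalance in the training data.
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 10.0]))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", patience=5, factor=0.5
)

# One training step on random data shaped like a batch of 32 windows.
x = torch.randn(32, 128, 25)
y = torch.randint(0, 2, (32,))
loss = criterion(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# After each validation epoch, pass the monitored metric to the scheduler.
scheduler.step(loss.item())
```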

	## Model Export

	### ONNX Export

	```python
	from secids.models import TemporalCNN
	import torch

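	model = TemporalCNN.load_from_checkpoint("model.ckpt")
	model.eval()  # export in inference mode so dropout and batch norm are frozen
	dummy_input = torch.randn(1, 128, 25)  # [batch, frames, features]

	torch.onnx.export(
	    model,
	    dummy_input,
	    "secids_v2.onnx",
	    input_names=["input"],
	    output_names=["output"],
	    dynamic_axes={"input": {0: "batch"}},  # allow variable batch size
	)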
	```

	### TensorRT Optimization

	```bash
	# Convert ONNX to TensorRT (FP16)
	trtexec --onnx=secids_v2.onnx \
	    --saveEngine=secids_v2_fp16.trt \
	    --fp16

	# Convert to INT8 (requires calibration data)
	trtexec --onnx=secids_v2.onnx \
	    --saveEngine=secids_v2_int8.trt \
	    --int8 \
	    --calib=calibration.cache
	```

	## Evaluation

	### Test Set Performance

	```bash
	python scripts/evaluate.py \
	    --model outputs/tcn_production/final_model.ckpt \
	    --data data/processed/test.parquet \
	    --output results/
	```

	Outputs:
	- Confusion matrix (PNG)
	- ROC/PR curves (PNG)
	- Per-attack-type metrics (JSON)
	- Latency profiling (CSV)

	### Benchmark Results

	| Dataset | Accuracy | Precision | Recall | F1-Score |
	|---------|----------|-----------|--------|----------|
	| Car Hacking (2021) | 98.2% | 97.8% | 98.6% | 98.2% |
	| HCRL (2020) | 97.5% | 96.9% | 98.1% | 97.5% |
	| SynCAN (2023) | 96.8% | 95.7% | 97.9% | 96.8% |
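
F1 is the harmonic mean of precision and recall, so the last column can be checked against the two before it. The sketch below does that and also shows how all four metrics follow from binary confusion-matrix counts; the helper names are illustrative and not part of `scripts/evaluate.py`.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts.

    Assumes at least one predicted positive and one actual positive.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# (precision %, recall %) pairs copied from the table above
rows = {
    "Car Hacking (2021)": (97.8, 98.6),
    "HCRL (2020)": (96.9, 98.1),
    "SynCAN (2023)": (95.7, 97.9),
}
for name, (p, r) in rows.items():
    print(f"{name}: F1 = {f1_score(p, r):.1f}%")
```

The recomputed values (98.2%, 97.5%, 96.8%) agree with the reported F1 scores.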

	## Citation

	```bibtex
	@software{secids_v2_2025,
	  author = {Hardani, Keyvan},
	  title  = {SecIDS-v2: Next-Generation Automotive Intrusion Detection System},
	  year   = {2025},
	  url    = {https://github.com/Keyvanhardani/SecIDS-v2},
	  note   = {Production-ready CAN-bus intrusion detection with Temporal CNNs}
	}
	```

	## Related Work

	- SecIDS v1: LSTM-based predecessor (97.2% accuracy, 18.5ms latency)
	- CANnolo: LSTM-autoencoder-based anomaly detection for CAN
	- GIDS: GAN-based intrusion detection for in-vehicle networks
	- Deep-CAN: Autoencoder-based anomaly detection

	## Acknowledgments

	- Dataset: OCSLab HK Security (Car Hacking Challenge 2021)
	- Framework: PyTorch Lightning team
	- Optimization: NVIDIA TensorRT team
	- Inspiration: Temporal CNN architecture from Bai et al. (2018)

	## License

	MIT License - See [LICENSE](LICENSE) for details

	## Contact

	- Author: Keyvan Hardani
	- GitHub: [Keyvanhardani/SecIDS-v2](https://github.com/Keyvanhardani/SecIDS-v2)
	- Issues: [GitHub Issues](https://github.com/Keyvanhardani/SecIDS-v2/issues)
	- Demo: [secids.keyvan.ai](http://secids.keyvan.ai)
	- LinkedIn: [keyvanhardani](https://www.linkedin.com/in/keyvanhardani/)

	---

	Last Updated: October 2025
	Model Version: 2.0.0
	Framework: PyTorch 2.0+, Lightning 2.0+