
LLM-Based Comprehensive Detection of Firewall Rule Anomalies

2025-08-29

Paper

Chang-Sheng Lee, I-Chen Lee, Ling-Jyh Chen. "Enhancing Firewall Rule Anomaly Detection via LLM Alignment." International Conference on Technologies and Applications of Artificial Intelligence, Taiwan, 2025.

Motivation

  • Traditional firewall rule sets are difficult to maintain because old rules accumulate, leading to complexity and higher costs.
  • Detecting anomalies (e.g., shadowing, redundancy, correlation) in firewall rules is a critical first step before simplifying them.
  • Existing rule-based methods lack flexibility and generalization.
  • Large Language Models (LLMs) offer a promising alternative due to their ability to recognize patterns and generalize.
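To make the anomaly types concrete, here is a minimal sketch of pairwise checks loosely following the classic firewall anomaly taxonomy. It is not the paper's detector: rules are simplified to an action plus a set of matched packets, whereas real rules match on 5-tuples (protocol, source/destination IP, source/destination port).

```python
# Simplified pairwise firewall rule anomaly checks (illustrative only).
# A rule's match condition is modeled as a frozenset of packet keys.
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    action: str       # "allow" or "deny"
    match: frozenset  # packets (e.g., "ip:port" keys) the rule matches

def classify_pair(earlier: Rule, later: Rule) -> str:
    """Classify the anomaly between an earlier rule and a later rule."""
    if later.match <= earlier.match:
        # The later rule can never fire: shadowed if the actions differ,
        # redundant if they agree.
        return "shadowing" if later.action != earlier.action else "redundancy"
    if earlier.match & later.match and earlier.action != later.action:
        if earlier.match <= later.match:
            return "generalization"  # later rule is a conflicting superset
        return "correlation"         # partial overlap, conflicting actions
    return "none"

r1 = Rule("deny", frozenset({"10.0.0.1:80", "10.0.0.2:80"}))
r2 = Rule("allow", frozenset({"10.0.0.1:80"}))
print(classify_pair(r1, r2))  # shadowing: r2 is fully covered by r1
```

The subset/overlap tests are the standard way these anomalies are defined for rule pairs; the paper's contribution is having an aligned LLM perform this classification from the rule text instead.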

Methods

  1. Model Training

    • Used Supervised Fine-Tuning (SFT) with a small dataset (75 examples) that included reasoning steps and anomaly labels.
    • Applied Reinforcement Learning (RL) with ~36,000 examples using Group Relative Policy Optimization (GRPO).
    • Designed reward functions focusing on both format correctness and answer accuracy.
  2. Experiment Setup

    • Models: Qwen3-4B (Base and Instruct versions).
    • Training hardware: RTX 4090, H100 NVL/PCIe (via RunPod).
    • Framework: Unsloth (for efficient training).
  3. Testing

    • Compared combinations of Base/Instruct with SFT and/or RL.
    • Evaluated accuracy on anomaly detection tasks involving firewall rule pairs.
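A reward combining format correctness and answer accuracy, as in step 1, might look like the sketch below. The tag names, weights, and label set are assumptions for illustration, not the paper's exact design; in a GRPO setup this scalar would be computed per sampled completion within a group.

```python
import re

# Illustrative GRPO-style reward: one component for format correctness
# (reasoning and final label wrapped in fixed tags) and one for answer
# accuracy. Tag names and weights here are assumptions.
LABELS = {"shadowing", "redundancy", "correlation", "none"}

def reward(completion: str, gold_label: str) -> float:
    score = 0.0
    # Format reward: the response must contain <think>...</think>
    # followed by <answer>...</answer>.
    m = re.search(r"<think>.*?</think>\s*<answer>(.*?)</answer>",
                  completion, flags=re.DOTALL)
    if m:
        score += 0.5
        predicted = m.group(1).strip().lower()
        # Accuracy reward: the extracted label must be valid and
        # match the gold annotation.
        if predicted in LABELS and predicted == gold_label:
            score += 1.0
    return score

out = ("<think>Rule 2 is fully covered by rule 1 and their actions "
       "differ.</think><answer>shadowing</answer>")
print(reward(out, "shadowing"))  # 1.5
```

Splitting the reward this way lets the model first learn to emit parseable output (cheap partial credit) before accuracy dominates, which is a common pattern in GRPO reward design.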

Results

  • Best performance: the Instruct model with SFT + RL, achieving ~99.2% accuracy.
  • Both SFT and RL improved accuracy, but RL contributed more than SFT.
  • The pure Base model achieved only ~50% accuracy; the pure Instruct model, ~70%.
  • However, performance collapsed when evaluating many rules (100+ simultaneously):

    • Models handled two-rule comparisons well but failed to generalize to larger rule sets.

Conclusion

  • LLM alignment (SFT + RL) significantly enhances performance for detecting firewall rule anomalies in pairwise settings.
  • Reinforcement learning is particularly powerful, while SFT shows limited benefit due to small dataset size.
  • Current methods lack generalization to complex, multi-rule scenarios.
  • Future work should test more pretrained models, employ curriculum learning, and experiment with different training strategies (prompts, reward functions, hyperparameters).