Abstract
Forest fires not only destroy vegetation and directly reduce forested area, but also severely degrade stand structure and habitat conditions, ultimately unbalancing the entire forest ecosystem. Accurate forest fire detection is therefore critical for ecological safety and for protecting lives and property. However, existing algorithms often struggle to detect flames and smoke in complex scenarios such as sparse smoke, weak flames, or vegetation occlusion, and their high computational cost hinders practical deployment. To address these challenges, this paper introduces F3-YOLO, a robust and fast forest fire detection model based on YOLOv12. F3-YOLO incorporates conditionally parameterized convolution (CondConv) to enhance representational capacity without a substantial increase in computational cost, improving fire detection against complex backgrounds. In addition, a frequency domain-based self-attention solver (FSAS) is integrated to exploit high-frequency and high-contrast information, better handling real-world scenes that contain both small distant targets in aerial imagery and large nearby targets on the ground. To provide more stable structural cues, we propose the Focaler Minimum Point Distance Intersection over Union loss (FMPDIoU), which helps the model capture the irregular, blurred boundaries caused by vegetation occlusion, flame jitter, and smoke dispersion. To enable efficient deployment on edge devices, we further apply structured pruning to reduce computational overhead. Compared with YOLOv12 and other mainstream methods, F3-YOLO achieves superior accuracy and robustness: it attains the highest mAP@50 of 68.5% among all compared methods on the evaluation dataset while requiring only 5.4 GFLOPs and a compact 2.6 M parameters. These attributes make it a reliable, low-latency solution well suited to real-time forest fire early warning systems.
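CondConv itself is a published building block (Yang et al., 2019): the kernel applied to each input example is a routed mixture of expert kernels, so capacity scales with the number of experts while the per-example convolution stays single-kernel cheap. The following is a minimal PyTorch sketch of that idea; the class name and hyperparameters are illustrative, not the authors' exact F3-YOLO integration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CondConv2d(nn.Module):
    """Conditionally parameterized convolution: the kernel applied to each
    example is a per-example weighted mixture of `num_experts` expert
    kernels, so representational capacity grows with the expert count while
    the cost per example stays close to one ordinary convolution."""

    def __init__(self, in_ch, out_ch, kernel_size=3, num_experts=4, padding=1):
        super().__init__()
        self.out_ch, self.padding = out_ch, padding
        # Expert kernel bank: (experts, out_ch, in_ch, k, k)
        self.weight = nn.Parameter(
            torch.randn(num_experts, out_ch, in_ch, kernel_size, kernel_size) * 0.02)
        # Routing head: global average pool -> linear -> sigmoid
        self.route = nn.Linear(in_ch, num_experts)

    def forward(self, x):
        b, c, h, w = x.shape
        r = torch.sigmoid(self.route(x.mean(dim=(2, 3))))        # (b, experts)
        # Mix the expert kernels per example: (b, out_ch, in_ch, k, k)
        kernels = torch.einsum('be,eoihw->boihw', r, self.weight)
        kernels = kernels.reshape(b * self.out_ch, c, *kernels.shape[-2:])
        # Grouped-conv trick: one group per example applies its own mixed kernel
        out = F.conv2d(x.reshape(1, b * c, h, w), kernels,
                       padding=self.padding, groups=b)
        return out.reshape(b, self.out_ch, out.shape[-2], out.shape[-1])
```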
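The FSAS block follows the frequency-domain attention idea of FFTformer (Kong et al., 2023), where the query-key correlation is computed as an element-wise product in the Fourier domain rather than as an explicit attention matrix. The sketch below is a simplified rendering under that assumption; the normalization and projection details in F3-YOLO may differ.

```python
import torch
import torch.nn as nn

class FSAS(nn.Module):
    """Sketch of a frequency domain-based self-attention solver: the q-k
    correlation is an element-wise product in the Fourier domain, at
    O(N log N) cost instead of an explicit N x N attention matrix, which
    preserves the high-frequency detail that small flames and thin smoke
    rely on."""

    def __init__(self, dim):
        super().__init__()
        self.to_qkv = nn.Conv2d(dim, dim * 3, kernel_size=1)
        self.norm = nn.GroupNorm(1, dim)  # stand-in for the paper's normalization
        self.proj = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x):
        q, k, v = self.to_qkv(x).chunk(3, dim=1)
        # Correlation in the frequency domain: F(q) * conj(F(k)), then inverse FFT
        q_f = torch.fft.rfft2(q.float())
        k_f = torch.fft.rfft2(k.float())
        corr = torch.fft.irfft2(q_f * torch.conj(k_f), s=x.shape[-2:])
        return self.proj(self.norm(corr) * v)
```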
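The abstract names FMPDIoU as a combination of Focaler-IoU's interval remapping with MPDIoU's corner-distance penalty. The function below is one plausible formulation assembled from those two published losses; the interval bounds `d` and `u` and the exact composition are assumptions, not the paper's verified definition.

```python
import torch

def fmpdiou_loss(pred, target, img_w, img_h, d=0.0, u=0.95, eps=1e-7):
    """Hypothetical FMPDIoU sketch: MPDIoU's corner-distance penalty plus
    Focaler-IoU's linear remapping of the IoU term onto the interval [d, u],
    which re-weights easy versus hard regression samples.
    Boxes are (x1, y1, x2, y2) tensors of shape (..., 4)."""
    # Plain IoU
    ix1 = torch.max(pred[..., 0], target[..., 0])
    iy1 = torch.max(pred[..., 1], target[..., 1])
    ix2 = torch.min(pred[..., 2], target[..., 2])
    iy2 = torch.min(pred[..., 3], target[..., 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)
    # Focaler remapping: rescale IoU linearly inside [d, u], clamp outside
    iou_f = ((iou - d) / (u - d + eps)).clamp(0.0, 1.0)
    # MPDIoU penalty: squared top-left and bottom-right corner distances,
    # normalized by the squared image diagonal
    d1 = (pred[..., 0] - target[..., 0]) ** 2 + (pred[..., 1] - target[..., 1]) ** 2
    d2 = (pred[..., 2] - target[..., 2]) ** 2 + (pred[..., 3] - target[..., 3]) ** 2
    diag = img_w ** 2 + img_h ** 2
    return 1.0 - iou_f + d1 / diag + d2 / diag
```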
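Structured pruning removes whole filters or channels, so the savings show up directly in parameters and GFLOPs, consistent with the 5.4 GFLOPs and 2.6 M figures above. The abstract does not state which criterion F3-YOLO uses; the sketch below illustrates the common L1-norm filter-pruning variant on a single layer, with `keep_ratio` as an illustrative parameter.

```python
import torch
import torch.nn as nn

def l1_filter_prune(conv: nn.Conv2d, keep_ratio: float = 0.7) -> nn.Conv2d:
    """Illustrative structured pruning of one convolution (assumes groups=1):
    whole output filters with the smallest L1 norms are dropped, so the new
    layer is genuinely smaller in both parameters and FLOPs, unlike
    unstructured weight sparsity."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # per-filter L1 norm
    keep = scores.topk(n_keep).indices.sort().values
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned
```

Pruning one layer also changes the input expected by whatever consumes it, so the next convolution's input channels and any intervening BatchNorm must be sliced to match; dependency-aware tools such as Torch-Pruning automate that bookkeeping across a full network.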