Noise Analysis for Text-based Spam Images
Peng Li, Hanbing Yan, Gang Cui, Yuejin Du

Abstract Traditional spam filters face growing challenges from the
rapid growth of image-based spam. Previous work has relied on OCR
techniques and text classifiers for image spam detection, which are
time-consuming and CPU-intensive. In addition, OCR is easily
defeated by the noise and content-obscuring elements that spammers
add. In this paper, we propose a novel approach that detects the
``{\it amount}'' and the ``{\it type}'' of noise introduced by these
anti-OCR techniques. First, we propose a text region localization
method based on a steerable filter and morphological processing,
which separates an image into text regions for OCR content
extraction and background regions for noise analysis. Next, a
wavelet transform is used to construct a noise feature image of the
background region, from which the noise is measured and classified.
Experimental results show that our method locates text regions
accurately and that the noise analysis effectively reflects the
noise interference in spam images. The approach can therefore be
viewed as complementary to OCR-based methods for further reducing
the false positives of image spam filters.
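
To make the wavelet-based noise measurement step concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state which wavelet or which noise measure is used, so the one-level Haar transform, the median-based estimator (median of the absolute diagonal detail coefficients divided by 0.6745), and the function names `haar_diagonal_detail` and `estimate_noise_sigma` are all assumptions chosen for illustration. The input is assumed to be a background region already separated from the text regions.

```python
import random
import statistics


def haar_diagonal_detail(image):
    """Return the diagonal (HH) sub-band of a one-level 2-D Haar transform.

    `image` is a list of equal-length rows of grey levels. A trailing odd
    row or column is ignored. (Haar is an assumed choice of wavelet.)
    """
    rows = len(image) - len(image) % 2
    cols = len(image[0]) - len(image[0]) % 2
    detail = []
    for r in range(0, rows, 2):
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            # Orthonormal Haar: difference of the two diagonals, scaled by 1/2.
            detail.append((a - b - d + e) / 2.0)
    return detail


def estimate_noise_sigma(image):
    """Estimate the noise standard deviation of a background region.

    Uses the robust median estimator median(|HH|) / 0.6745, which is an
    assumed noise measure, not one taken from the paper.
    """
    hh = haar_diagonal_detail(image)
    return statistics.median(abs(v) for v in hh) / 0.6745


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical background regions: a flat one and one with added noise.
    flat = [[128.0] * 64 for _ in range(64)]
    noisy = [[128.0 + rng.gauss(0.0, 10.0) for _ in range(64)] for _ in range(64)]
    print("flat background :", estimate_noise_sigma(flat))
    print("noisy background:", estimate_noise_sigma(noisy))
```

A flat background yields an estimate of zero, while a background with added random noise yields a value close to the noise standard deviation. A filter could compare such a value against a threshold to decide how much to trust the OCR output, although the threshold itself would have to be tuned on real data.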