PHISHING WEBSITE DETECTION USING ANT COLONY OPTIMIZATION

Phishing is a form of social engineering in which an attacker, also known as a phisher, attempts to fraudulently retrieve legitimate users’ confidential or sensitive credentials by mimicking electronic communications from a trustworthy or public organization in an automated fashion. The word “phishing” appeared around 1995, when Internet scammers were using email lures to “fish” for passwords and financial information from the sea of Internet users; “ph” is the common hacker replacement of “f”, which comes from the original form of hacking, “phreaking” on telephone switches during the 1960s. Early phishers copied the code from the AOL website and crafted pages that looked like they were a part of AOL, and sent spoofed emails or instant messages with a link to this fake web page, asking potential victims to reveal their passwords.

The proposed method detects phishing websites using features available in the URL and page contents, without relying on search engines such as Google. The methodology aims to extract as many of the features reported in the literature as possible and then to find the robust features that are not affected by concept drift. This addresses the question: are there features that give the required accuracy when the training and testing data come from different times, given that phishers change their tactics from time to time? Once such features have been selected using Ant Colony Optimization, they are evaluated with three classifiers, an Artificial Neural Network (ANN), a Support Vector Machine (SVM) and the Treefit algorithm, to decide which one gives the best performance. The performance analysis is carried out by software simulation in Matlab, using measures such as accuracy, sensitivity and selectivity.
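The performance measures named above can be computed from a confusion matrix. The following is a minimal sketch in Python rather than Matlab, not the project's own code. It assumes that "selectivity" means the true negative rate (often called specificity) and that label 1 marks a phishing site; the function names are illustrative only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def performance(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and selectivity as a dict.

    Assumption: "selectivity" is taken to be the true negative rate.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "selectivity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    guess = [1, 1, 0, 0, 0, 1, 0, 1]
    print(performance(truth, guess))
```

The same three numbers can be reported for each classifier (ANN, SVM, Treefit) so that they are compared on identical terms.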

Data set collection and pre-processing

The data sets are prepared as shown in the Figure, which illustrates the whole data set collection and pre-processing process. The phishing websites are collected from the PhishTank website in CSV format.
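As a rough illustration of the pre-processing step, the sketch below reads a CSV export and derives four of the binary URL features named in the feature list in this section. Several things here are assumptions rather than part of the original project: the column name `url`, the sample rows, the short list of URL-shortener hosts, the detection rules, and the convention that 1 means the suspicious indicator is present.

```python
import csv
import io
import re
from urllib.parse import urlparse

# Hypothetical list of shortener hosts; a real list would be much longer.
SHORTENER_HOSTS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co"}
IPV4_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

# Made-up rows standing in for a downloaded CSV file.
SAMPLE_CSV = """phish_id,url
1,http://192.168.10.5/paypal/login.php
2,http://bit.ly/3xYz
3,http://secure.example.com@evil.example.net//redirect
"""


def url_features(url):
    """Derive a few binary features from a URL (1 = indicator present)."""
    host = (urlparse(url).hostname or "").lower()
    return {
        "having_IP_Address": int(bool(IPV4_HOST.match(host))),
        "having_At_Symbol": int("@" in url),
        # A "//" located after the scheme separator hints at a redirect.
        "double_slash_redirecting": int(url.rfind("//") > 7),
        "Shortining_Service": int(host in SHORTENER_HOSTS),
    }


def load_feature_rows(csv_text, url_column="url"):
    """Parse CSV text and return one feature dict per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [url_features(row[url_column]) for row in reader]


if __name__ == "__main__":
    for features in load_feature_rows(SAMPLE_CSV):
        print(features)
```

Features that depend on page contents or on external lookups (for example `age_of_domain` or `DNSRecord`) need more than the URL string and are left out of this sketch.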

After the data sets are generated, the required features are as given below.

Features:

1. having_IP_Address { 1,0 }

2. URL_Length { 1,0,-1 }

3. Shortining_Service { 0,1 }

4. having_At_Symbol { 0,1 }

5. double_slash_redirecting { 1,0 }

6. Prefix_Suffix { -1,0,1 }

7. having_Sub_Domain {

8. SSLfinal_State { -1,1,0 }

9. Domain_registeration_length { 0,1,

10. Favicon { 0,1 }

11. port { 0,1 }

12. HTTPS_token { 1,0 }

13. Request_URL { 1,-1 }

14. URL_of_Anchor { -1,0,1 }

15. Links_in_tags { 1,-1,0 }

16. SFH { -1,1 }

17. Submitting_to_email { 1,0 }

18. Abnormal_URL { 1,0 }

19. Redirect { 0,1 }

20. on_mouseover { 0,1 }

21. RightClick { 0,1 }

22. popUpWidnow { 0,1 }

23. Iframe { 0,1 }

24. age_of_domain { -1,0,1 }

25. DNSRecord { 1,0 }

26. web_traffic { -1,0,1 }

27. Page_Rank { -1,0,1 }

28. Google_Index { 0,1 }

29. Links_pointing_to_page { 1,0,-1 }

30. Statistical_report { 1,0 }
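A feature list like the one above forms the search space for Ant Colony Optimization. The sketch below shows one simplified binary variant, in which each ant decides per feature whether to keep or skip it, and pheromone reinforces the choices of the best subset found so far. It is not the project's Matlab implementation. The nearest-centroid fitness function, the toy data, the subset-size penalty and all parameter values are assumptions chosen to keep the example self-contained.

```python
import random


def centroid_accuracy(train, test, subset):
    """Accuracy of a nearest-centroid classifier restricted to `subset`."""
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(subset))
        for k, j in enumerate(subset):
            acc[k] += x[j]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(x):
        return min(centroids, key=lambda y: sum(
            (x[j] - c) ** 2 for j, c in zip(subset, centroids[y])))

    return sum(predict(x) == y for x, y in test) / len(test)


def aco_select(train, test, n_features, n_ants=10, n_iters=20,
               evaporation=0.2, size_penalty=0.01, tau_min=0.05, seed=0):
    """Return (best_subset, best_score) found by a simple binary ACO."""
    rng = random.Random(seed)
    # tau[j] = [pheromone for skipping feature j, pheromone for keeping it]
    tau = [[1.0, 1.0] for _ in range(n_features)]
    best_subset, best_score = [], float("-inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            subset = [j for j in range(n_features)
                      if rng.random() < tau[j][1] / (tau[j][0] + tau[j][1])]
            if not subset:  # an ant must keep at least one feature
                subset = [rng.randrange(n_features)]
            score = (centroid_accuracy(train, test, subset)
                     - size_penalty * len(subset))
            if score > best_score:
                best_subset, best_score = subset, score
        # Evaporate all trails, then reinforce the best subset so far.
        chosen = set(best_subset)
        for j in range(n_features):
            for choice in (0, 1):
                tau[j][choice] = max(tau_min, (1 - evaporation) * tau[j][choice])
            tau[j][1 if j in chosen else 0] += max(best_score, 0.0)
    return best_subset, best_score


def toy_data(n_rows, n_features, rng):
    """Synthetic rows: feature 0 follows the label, the rest are noise."""
    rows = []
    for _ in range(n_rows):
        y = rng.choice([0, 1])
        x = [rng.choice([-1, 0, 1]) for _ in range(n_features)]
        x[0] = 1 if y == 1 else -1
        rows.append((x, y))
    return rows


if __name__ == "__main__":
    data = toy_data(200, 8, random.Random(1))
    subset, score = aco_select(data[:150], data[150:], n_features=8)
    print("selected feature indices:", subset, "score:", round(score, 3))
```

To probe robustness against concept drift, the training and test rows passed to `aco_select` could come from different collection periods instead of a random split. Any classifier (ANN, SVM, decision tree) can replace the nearest-centroid fitness function.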

