GRIP_Task3

A git repo to store all of the files for task-3 of GRIP internship by TSF

This project is maintained by SuhruthY

Exploring the Global Terrorism Data: Tableau Public

Overview

 Despite the various strategies employed by security and intelligence agencies throughout the world, terrorist activity remains a major challenge of this millennium. Recently, data-mining techniques have evolved to allow the identification of patterns and associations in criminal activity.

 The Global Terrorism Database (GTD) has recently released a new version that includes all data on terrorist attacks through 2019, with the year 1993 provided as a separate file. The GTD codebook provides detailed documentation of the data and its inclusion criteria. I used SQLite to subset the data and then cleaned and preprocessed it in Python and R. I then used Tableau, a data visualization tool, to join the subsets and explore the data.
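 As a rough illustration of the subsetting step, here is a minimal sketch using Python's built-in sqlite3 module. The table name gtd, the toy rows, and the choice to replace missing counts with 0 are assumptions made for this example; the column names follow the GTD codebook (eventid, iyear, country_txt, nkill, nwound), but the actual queries used in the project may differ.

```python
import sqlite3

# Hypothetical in-memory table standing in for the full GTD export.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gtd ("
    "eventid INTEGER, iyear INTEGER, country_txt TEXT, nkill REAL, nwound REAL)"
)
toy_rows = [
    (1, 2018, "India", 2, 5),
    (2, 2019, "India", None, 1),   # missing fatality count
    (3, 2019, "Iraq", 10, None),   # missing injury count
    (4, 1970, "Italy", 0, 0),
]
conn.executemany("INSERT INTO gtd VALUES (?, ?, ?, ?, ?)", toy_rows)


def subset_years(conn, first_year, last_year):
    """Return events within [first_year, last_year], ordered by event id.

    Missing counts are replaced with 0 here; that is a cleaning choice made
    for this sketch, not necessarily the one used in the project.
    """
    query = """
        SELECT eventid, iyear, country_txt,
               COALESCE(nkill, 0)  AS nkill,
               COALESCE(nwound, 0) AS nwound
        FROM gtd
        WHERE iyear BETWEEN ? AND ?
        ORDER BY eventid
    """
    return conn.execute(query, (first_year, last_year)).fetchall()


recent = subset_years(conn, 2018, 2019)
for row in recent:
    print(row)
```

 Using query parameters (the ? placeholders) keeps the year range out of the SQL string, so the same function can produce each yearly subset before it is exported for Tableau.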

 After working through this project, one can dive deeper into each feature and explore its relationships with the others. Various statistical analyses can then be applied to understand, discover, and predict terrorist attacks.

Literature Survey

 Fatalities (number of deaths) and casualties (number of deaths and injuries) are the key quantities I dealt with in this project. The research I drew on covers topics such as the correlation factors that influence terrorist attacks by ISSST, country-level terrorism trends, and the identification of subgroups to prevent mass casualties. Most of this work focuses on statistical methods and machine learning approaches to detect terrorist activities in depth.
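 The two quantities defined above can be computed per event as in the short sketch below. The field names nkill and nwound follow the GTD codebook; the toy events and the decision to treat a missing count as 0 are assumptions made only for this example.

```python
def casualties(nkill, nwound):
    """Casualties = deaths + injuries; a missing count (None) is treated as 0."""
    return (nkill or 0) + (nwound or 0)


# Toy events, not real GTD records.
events = [
    {"eventid": 1, "nkill": 2, "nwound": 5},
    {"eventid": 2, "nkill": None, "nwound": 1},
    {"eventid": 3, "nkill": 10, "nwound": None},
]

total_fatalities = sum(e["nkill"] or 0 for e in events)
total_casualties = sum(casualties(e["nkill"], e["nwound"]) for e in events)
print(total_fatalities, total_casualties)  # 12 18
```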

 The 2019 background report provided by START was my starting point for understanding the GTD as a whole. It showcases an intuitive way of understanding the Global Terrorism Index (GTI) score, and it examines regional trends and trends among terrorist groups.

Procedure

Global Terrorism Database

 The GTD defines a terrorist attack as the threatened or actual use of illegal force and violence by a non-state actor to attain a political, economic, religious, or social goal through fear, coercion, or intimidation. To consider an incident for inclusion in the GTD, all three of the following attributes must be present:

 In addition, at least two of the following three criteria must be present

Some things to be noted while working

 Check out my take on finding GTI score here: Calculating Global Terrorism Index

3-Phase Model

The project is divided into smaller segments; it aims to

 In the first part, we explore geospatial trends with GTI scores in mind and derive various global KPIs. In the next part, we compare terrorism between two consecutive years and study the most influential terrorist groups over the years. The last part tries to unearth hidden patterns in the textual data and correlations between categories.

Statistical Techniques

 I used Correspondence Analysis to measure the association between the categorical variables and fatality. A new categorical feature with the levels very-low, low, high, and very-high is derived from the number of deaths. For each variable in the study, I built a frequency table against these fatality levels, then calculated the distance measure to obtain the final contributions. Finally, I applied Singular Value Decomposition to extract two principal components that explain more than 90% of the variance.
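 The steps above can be sketched with NumPy as follows. This is a textbook form of simple correspondence analysis (standardized residuals followed by SVD), not necessarily the exact procedure used in the project. The fatality cut-offs, the category names, and the toy records are all hypothetical; the source does not state the thresholds.

```python
import numpy as np

LEVELS = ["very-low", "low", "high", "very-high"]


def fatality_level(nkill):
    """Bin a death count into a level. Cut-offs are hypothetical."""
    if nkill == 0:
        return "very-low"
    if nkill <= 2:
        return "low"
    if nkill <= 10:
        return "high"
    return "very-high"


def frequency_table(records, categories):
    """Count events per (category, fatality level)."""
    table = np.zeros((len(categories), len(LEVELS)))
    for category, nkill in records:
        row = categories.index(category)
        col = LEVELS.index(fatality_level(nkill))
        table[row, col] += 1
    return table


def correspondence_analysis(table):
    """Return row coordinates, column coordinates, and explained inertia.

    Assumes every row and column of the table has a non-zero total.
    """
    P = table / table.sum()              # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)            # independence model
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    inertia = sing ** 2                  # principal inertias
    explained = inertia / inertia.sum()
    row_coords = (U * sing) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sing) / np.sqrt(c)[:, None]
    return row_coords, col_coords, explained


# Toy (category, number of deaths) records, not real GTD data.
categories = ["Bombing", "Armed Assault", "Kidnapping"]
records = [
    ("Bombing", 0), ("Bombing", 0), ("Bombing", 1), ("Bombing", 5),
    ("Bombing", 20), ("Armed Assault", 0), ("Armed Assault", 2),
    ("Armed Assault", 3), ("Armed Assault", 7), ("Armed Assault", 15),
    ("Armed Assault", 12), ("Kidnapping", 0), ("Kidnapping", 0),
    ("Kidnapping", 0), ("Kidnapping", 1),
]

table = frequency_table(records, categories)
row_coords, col_coords, explained = correspondence_analysis(table)
print(table)
print("share of inertia in the first two components:", explained[:2].sum())
```

 In correspondence analysis the "variance" explained by each component is usually called inertia; the squared singular values sum to the chi-square statistic of the table divided by the number of events. Plotting the first two columns of row_coords and col_coords on the same axes gives the usual biplot of categories against fatality levels.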

 Iterative topic modeling is performed on the textual data to understand the term frequency and document frequency of the top words in each year.
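 To make the two counts concrete, the sketch below computes term frequency (total occurrences of a word) and document frequency (number of texts containing the word) for the top words of each year. It is only the counting step, not a topic model, and the stop-word list, tokenizer, and sample summaries are assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "in", "of", "and", "at", "on", "by", "to"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def top_words_by_year(docs_by_year, top_n=3):
    """Map each year to a list of (word, term_frequency, document_frequency)."""
    result = {}
    for year, docs in docs_by_year.items():
        tf, df = Counter(), Counter()
        for doc in docs:
            tokens = tokenize(doc)
            tf.update(tokens)        # every occurrence counts
            df.update(set(tokens))   # each document counts once per word
        result[year] = [(w, tf[w], df[w]) for w, _ in tf.most_common(top_n)]
    return result


# Toy incident summaries, not real GTD text.
docs_by_year = {
    2018: [
        "Assailants attacked a police checkpoint",
        "A bomb exploded near a police station; police responded",
    ],
    2019: [
        "Gunmen attacked a village",
        "Gunmen attacked a convoy",
    ],
}

top_words = top_words_by_year(docs_by_year, top_n=2)
for year, words in top_words.items():
    print(year, words)
```

 A word with high term frequency but low document frequency is concentrated in a few incident summaries, while one with high values for both is spread across the year's reports.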

 A detailed explanation is available in Exploring the Global Terrorism Data: The Backend

Conclusion & Future Scope

 In summary, we have derived a potential metric to quantify terrorism, observed the similarities and differences between consecutive years, and uncovered patterns through different statistical methods.

 This project could be a starting point for understanding the detailed correlations between various features in the GTD. Possible extensions include latent class growth modeling to find chronological patterns, clustering with different methodologies to group the events, analyzing killing ranges, and understanding the origin and activity of terrorist organizations.

 Also, note that we have worked only with the GTD throughout the project. Future work could draw on other terrorism databases such as RAND, the MIPT Terrorism Knowledge Base, the Worldwide Incidents Tracking System, Tocsearch, etc.

References