A git repo to store all of the files for task-3 of GRIP internship by TSF
This project is maintained by SuhruthY
Despite the many strategies employed by security and intelligence agencies around the world, terrorist activity remains a major challenge of this millennium. In recent years, data-mining techniques have evolved to allow the identification of patterns and associations in criminal activity.
The Global Terrorism Database (GTD) has recently released a new version that includes data on terrorist attacks through 2019, with the year 1993 provided as a separate file. The GTD codebook provides detailed documentation about the data and its inclusion criteria. I used SQLite to subset the data and then cleaned and preprocessed it in Python and R. I then used Tableau, a data visualization tool, to join the subsets and explore the data.
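As a rough illustration of the subsetting step, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `gtd`, the helper `subset_by_year`, and the sample rows are hypothetical; the column names (`eventid`, `iyear`, `country_txt`, `nkill`, `nwound`) are assumed to follow the GTD codebook. Treating missing counts as zero is also an assumption made for this example, not necessarily the cleaning rule used in the project.

```python
import sqlite3

# Made-up rows standing in for the real GTD export.
rows = [
    (1, 2018, "Country A", 3, 5),
    (2, 2018, "Country B", None, 2),
    (3, 2019, "Country A", 0, None),
    (4, 2019, "Country C", 12, 30),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gtd (eventid INTEGER, iyear INTEGER, country_txt TEXT, "
    "nkill INTEGER, nwound INTEGER)"
)
conn.executemany("INSERT INTO gtd VALUES (?, ?, ?, ?, ?)", rows)


def subset_by_year(connection, year):
    """Return (eventid, country, fatalities, casualties) for one year.

    Fatalities are deaths; casualties are deaths plus injuries.
    Missing counts are treated as 0 (an assumption for this sketch).
    """
    query = """
        SELECT eventid,
               country_txt,
               COALESCE(nkill, 0) AS fatalities,
               COALESCE(nkill, 0) + COALESCE(nwound, 0) AS casualties
        FROM gtd
        WHERE iyear = ?
        ORDER BY eventid
    """
    return connection.execute(query, (year,)).fetchall()


subset_2019 = subset_by_year(conn, 2019)
print(subset_2019)
# [(3, 'Country A', 0, 0), (4, 'Country C', 12, 42)]
```

Each yearly subset produced this way could then be exported for further cleaning or for joining in a visualization tool.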
After working on this project, one can dive deeper into each feature and explore its relationships with the others. We can perform various statistical analyses to understand, discover and predict upcoming terrorist attacks.
Fatalities (number of deaths) and casualties (number of deaths and injuries) are the key measures I dealt with in this project. The research topics I drew on include correlation factors that influence terrorist attacks by ISSST, country-level terrorism trends, and identification of subgroups to prevent mass casualties. Most of this research focuses on statistical methods and machine learning approaches to detect terrorist activities in depth.
The 2019 background report provided by START was my starting point for understanding the GTD as a whole. It showcases an intuitive way of understanding the GTI score. It also studies various regional trends and trends among terrorist groups.
The GTD defines a terrorist attack as the threatened or actual use of illegal force and violence by a non-state actor to attain a political, economic, religious, or social goal through fear, coercion, or intimidation. To consider an incident for inclusion in the GTD, all three of the following attributes must be present:
In addition, at least two of the following three criteria must be present:
Some things to be noted while working
Check out my take on finding the GTI score here: Calculating Global Terrorism Index
Divided into smaller segments, the project aims to
In the first part, we explore the geospatial trends with GTI scores in mind. We will also find various global KPIs that can be derived. In the next part, we compare terrorism between two consecutive years and study the most influential terrorist groups over the years. The last part tries to unearth hidden patterns in the textual data and correlations between categories.
I used Correspondence Analysis to examine the association between the categorical variables and fatality. A new four-level factor (very-low, low, high, very-high) is derived from the number of deaths. For each variable in the study, I made a frequency table of fatality levels. A distance measure is then calculated to obtain the final contributions. Finally, I applied Singular Value Decomposition to obtain two principal components that explain more than 90% of the variance.
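The steps above can be sketched with NumPy as follows. This is the textbook formulation of correspondence analysis (standardized residuals of the frequency table followed by an SVD), which may differ in detail from what the project does. The cut points in `fatality_level`, the function names, and the toy attack-type labels and death counts are all hypothetical. The sketch assumes every row and column of the frequency table has at least one observation.

```python
import numpy as np

LEVELS = ["very-low", "low", "high", "very-high"]


def fatality_level(deaths):
    """Map a death count to one of four levels (cut points are hypothetical)."""
    if deaths <= 1:
        return "very-low"
    if deaths <= 5:
        return "low"
    if deaths <= 20:
        return "high"
    return "very-high"


def frequency_table(categories, deaths):
    """Build a category x fatality-level frequency (contingency) table."""
    labels = sorted(set(categories))
    table = np.zeros((len(labels), len(LEVELS)))
    for cat, d in zip(categories, deaths):
        table[labels.index(cat), LEVELS.index(fatality_level(d))] += 1
    return labels, table


def correspondence_analysis(table):
    """Return 2-D row/column coordinates and the share of inertia per axis."""
    P = table / table.sum()              # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    expected = np.outer(r, c)            # independence model
    S = (P - expected) / np.sqrt(expected)  # standardized residuals
    U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
    inertia = sigma ** 2                 # principal inertias
    explained = inertia / inertia.sum()
    row_coords = (U * sigma) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sigma) / np.sqrt(c)[:, None]
    return row_coords[:, :2], col_coords[:, :2], explained


# Toy data: illustrative attack-type labels and made-up death counts.
categories = ["bombing"] * 3 + ["assault"] * 3 + ["kidnapping"] * 3
deaths = [0, 30, 8, 3, 6, 25, 0, 1, 2]

labels, table = frequency_table(categories, deaths)
row_coords, col_coords, explained = correspondence_analysis(table)
print(labels)
print(explained[:2].sum())  # share of inertia captured by the first two axes
```

Plotting `row_coords` and `col_coords` on the same axes gives the usual correspondence-analysis map, where categories that sit near a fatality level are associated with it more often than independence would predict.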
Iterative topic modeling is performed on the textual data to understand the term frequency and document frequency of the top words in each year.
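To make the two quantities concrete, here is a small standard-library sketch that counts, for each year, how often a word occurs overall (term frequency) and in how many incident descriptions it appears (document frequency). It covers only the counting, not the topic modeling itself. The stopword list, the function names, and the sample sentences are hypothetical, and the input is assumed to be `(year, text)` pairs.

```python
import re
from collections import Counter, defaultdict

# A tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "in", "of", "and", "on", "at", "by", "to", "was", "were"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def yearly_term_stats(records, top_n=3):
    """For each year, return (term, term_frequency, document_frequency)
    for the top_n most frequent terms."""
    tf = defaultdict(Counter)  # total occurrences of each term per year
    df = defaultdict(Counter)  # number of documents containing each term per year
    for year, text in records:
        tokens = tokenize(text)
        tf[year].update(tokens)
        df[year].update(set(tokens))
    return {
        year: [(term, count, df[year][term]) for term, count in tf[year].most_common(top_n)]
        for year in sorted(tf)
    }


# Made-up incident descriptions for illustration only.
records = [
    (2018, "Assailants attacked a checkpoint; the checkpoint was damaged"),
    (2018, "An explosive device detonated near a market"),
    (2019, "Assailants opened fire on a convoy"),
    (2019, "Assailants attacked a convoy near the border"),
]

stats = yearly_term_stats(records, top_n=2)
for year, terms in stats.items():
    print(year, terms)
```

A word with high term frequency but low document frequency (like "checkpoint" in the 2018 sample) is concentrated in a few descriptions, while a word with both values high is spread across many incidents.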
You can find a detailed explanation in Exploring the Global Terrorism Data: The backend
In summary, we have derived a potential metric to quantify terrorism, observed the similarities and dissimilarities between consecutive years, and uncovered patterns through different statistical methods.
This project could be a starting point for understanding the detailed correlations between various features in the GTD. From here, you can study latent class growth modeling to find chronological patterns, cluster the events using different methodologies, analyze killing ranges, and examine the origin and activity of terrorist organizations.
Also, note that we have worked only with the GTD throughout the project. Future work could draw on other terrorism databases such as RAND, the MIPT Terrorism Knowledge Base, the Worldwide Incidents Tracking System, Tocsearch, etc.