The aim of this report is to analyse the last six years of debit/credit card expenditure data, obtained from the Central Bank of the Republic of Turkey. The source code can be found here.
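The following datasets will be used to provide insights:
Total credit/debit card expenditure amount (weekly)
Total number of credit/debit card transactions (weekly)
USD-TRY exchange rate (daily)
Consumer price index of Turkey (monthly)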
The main goal is to examine the behaviour of the total credit/debit card expenditure amount with the help of the other datasets. (All of these data belong to the Central Bank of the Republic of Turkey.)
The preparation part consists of loading the required packages, reading the data, and taking a first look at it.
# including the required packages
library(readxl)
library(dplyr)
library(ggplot2)
library(lubridate)
library(xts)
# reading the data
total_expenditure <- read_xlsx('totalcardexpenditure.xlsx')
num_transactions <- read_xlsx('totalnumoftransactions.xlsx')
usd_try <- read_xlsx('usdcurrency.xlsx')
turkey_cpi <- read_xlsx('turkeycpi.xlsx')
head(total_expenditure)
## # A tibble: 6 x 2
## Tarih `TP KKHARTUT KT1`
## <chr> <dbl>
## 1 03-01-2014 NA
## 2 10-01-2014 NA
## 3 17-01-2014 NA
## 4 24-01-2014 NA
## 5 31-01-2014 NA
## 6 07-02-2014 NA
summary(total_expenditure)
## Tarih TP KKHARTUT KT1
## Length:328 Min. : 7070685
## Class :character 1st Qu.:10779142
## Mode :character Median :12803828
## Mean :13578570
## 3rd Qu.:16180676
## Max. :24369424
## NA's :9
head(num_transactions)
## # A tibble: 6 x 3
## Tarih `TP KKISLADE KA1` ...3
## <chr> <chr> <chr>
## 1 03-01-2014 <NA> <NA>
## 2 10-01-2014 <NA> <NA>
## 3 17-01-2014 <NA> <NA>
## 4 24-01-2014 <NA> <NA>
## 5 31-01-2014 <NA> <NA>
## 6 07-02-2014 <NA> <NA>
summary(num_transactions)
## Tarih TP KKISLADE KA1 ...3
## Length:336 Length:336 Length:336
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
head(usd_try)
## # A tibble: 6 x 3
## Tarih `TP DK USD A YTL` ...3
## <chr> <dbl> <lgl>
## 1 01-01-2014 NA NA
## 2 02-01-2014 2.13 NA
## 3 03-01-2014 2.17 NA
## 4 04-01-2014 NA NA
## 5 05-01-2014 NA NA
## 6 06-01-2014 2.17 NA
summary(usd_try)
## Tarih TP DK USD A YTL ...3
## Length:2307 Min. :2.071 Mode:logical
## Class :character 1st Qu.:2.767 NA's:2307
## Mode :character Median :3.522
## Mean :3.796
## 3rd Qu.:5.281
## Max. :6.907
## NA's :722
head(turkey_cpi)
## # A tibble: 6 x 3
## Tarih `TP FG J0` ...3
## <chr> <chr> <chr>
## 1 2014-01 233.54 <NA>
## 2 2014-02 234.54 <NA>
## 3 2014-03 237.18 <NA>
## 4 2014-04 240.37 <NA>
## 5 2014-05 241.32 <NA>
## 6 2014-06 242.07 <NA>
summary(turkey_cpi)
## Tarih TP FG J0 ...3
## Length:87 Length:87 Length:87
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
Here are the problems that stand out after a first look at the data:
There are some missing values at the beginning of the total_expenditure and num_transactions datasets because their recordings began on 2014-03-07.
There are some unnecessary columns in 3 of these datasets.
Most of the columns have uninformative names, and most of them are not in the right data type.
The USD-TRY exchange rate data have NA values on weekends.
To fix these problems, some data manipulation is needed.
The manipulations for each dataset can be found below.
total_expenditure <- total_expenditure %>%
rename(Date = 'Tarih', Amount = 'TP KKHARTUT KT1') %>%
filter(!is.na(Date) & !is.na(Amount))
total_expenditure$Date <- as.Date(total_expenditure$Date, format='%d-%m-%Y')
head(total_expenditure)
## # A tibble: 6 x 2
## Date Amount
## <date> <dbl>
## 1 2014-03-07 8004400
## 2 2014-03-14 8650779
## 3 2014-03-21 8501345
## 4 2014-03-28 8559196
## 5 2014-04-04 8176410
## 6 2014-04-11 8771625
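The cleaning step above comes down to two operations: dropping the rows recorded before the series starts, and parsing day-month-year strings into Date values. A minimal base-R sketch on a made-up three-row table (the name toy_raw and its values are illustrative, not taken from the data) could look like this:

```r
# Toy table mimicking the raw layout; values are invented for illustration
toy_raw <- data.frame(
  Tarih = c("28-02-2014", "07-03-2014", "14-03-2014"),
  Value = c(NA, 100, 110),
  stringsAsFactors = FALSE
)

# Keep only the rows that actually carry an amount
toy_clean <- toy_raw[!is.na(toy_raw$Value), ]

# Parse "dd-mm-YYYY" strings into Date objects
toy_clean$Date <- as.Date(toy_clean$Tarih, format = "%d-%m-%Y")

# Keep the two columns of interest under readable names
toy_clean <- data.frame(Date = toy_clean$Date, Amount = toy_clean$Value)
print(toy_clean)
```

Without the format argument, as.Date would not read these strings as day-month-year, so spelling the format out is the safer choice.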
num_transactions <- num_transactions %>%
rename(Date = 'Tarih', Number = 'TP KKISLADE KA1') %>%
select(Date, Number) %>%
filter(!is.na(Date) & !is.na(Number)) %>%
slice(1:319)
num_transactions$Date <- as.Date(num_transactions$Date, format='%d-%m-%Y')
num_transactions$Number <- as.numeric(num_transactions$Number)
head(num_transactions)
## # A tibble: 6 x 2
## Date Number
## <date> <dbl>
## 1 2014-03-07 65962973
## 2 2014-03-14 69238515
## 3 2014-03-21 71510062
## 4 2014-03-28 71276145
## 5 2014-04-04 66492495
## 6 2014-04-11 70694670
usd_try <- usd_try %>%
rename(Date = 'Tarih', Currency = 'TP DK USD A YTL') %>%
select(Date, Currency) %>%
slice(2:2299) %>%
mutate(USDTRY = na.locf(Currency)) %>%
select(Date, USDTRY)
usd_try$Date <- as.Date(usd_try$Date, format='%d-%m-%Y')
head(usd_try)
## # A tibble: 6 x 2
## Date USDTRY
## <date> <dbl>
## 1 2014-01-02 2.13
## 2 2014-01-03 2.17
## 3 2014-01-04 2.17
## 4 2014-01-05 2.17
## 5 2014-01-06 2.17
## 6 2014-01-07 2.19
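The weekend gaps are filled with na.locf, which carries the last observed value forward. The sketch below uses the first few rates shown earlier and also illustrates why a leading NA is a problem: with its default settings na.locf drops leading NAs, so the result would be shorter than the column it is assigned to. Dropping the first row beforehand, as slice(2:2299) does, avoids that.

```r
library(zoo)

# Thursday, Friday, Saturday (NA), Sunday (NA), Monday, Tuesday
rate <- c(2.13, 2.17, NA, NA, 2.17, 2.19)

# Weekend gaps take Friday's value
filled <- na.locf(rate)
print(filled)

# A leading NA has nothing to carry forward and is dropped by default
with_leading_na <- c(NA, 2.13, NA)
length(na.locf(with_leading_na))                 # shorter than the input
length(na.locf(with_leading_na, na.rm = FALSE))  # same length, leading NA kept
```

Passing na.rm = FALSE is an alternative to slicing when the leading NA should stay in place.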
turkey_cpi <- turkey_cpi %>%
rename(Date = 'Tarih', CPI = 'TP FG J0') %>%
select(Date, CPI) %>%
slice(1:75)
turkey_cpi$Date <- as.yearmon(turkey_cpi$Date, format='%Y-%m')
turkey_cpi$CPI <- as.numeric(turkey_cpi$CPI)
head(turkey_cpi)
## # A tibble: 6 x 2
## Date CPI
## <yearmon> <dbl>
## 1 Jan 2014 234.
## 2 Feb 2014 235.
## 3 Mar 2014 237.
## 4 Apr 2014 240.
## 5 May 2014 241.
## 6 Jun 2014 242.
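The two conversions used for the CPI data can be tried in isolation. This small sketch takes the first three rows shown earlier as character vectors (the names raw_month and raw_cpi are illustrative) and converts them the same way:

```r
library(zoo)

raw_month <- c("2014-01", "2014-02", "2014-03")
raw_cpi   <- c("233.54", "234.54", "237.18")

# Year-month strings become yearmon values, which sort and compare as months
month <- as.yearmon(raw_month, format = "%Y-%m")

# Character numbers become doubles
cpi <- as.numeric(raw_cpi)

print(month)
print(cpi)
```

The tibble printout above shows 234. rather than 233.54 only because tibbles abbreviate numbers to three significant digits when printing; the stored values keep their decimals.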
At this point, the datasets are ready to be used for the analysis.
The first question this analysis can answer is whether there is a positive correlation between the amount spent and the number of transactions. First, these two variables have to be plotted together. To achieve that, the two data frames should be merged.
amount_with_transactions <- merge(total_expenditure, num_transactions, by = 'Date')
amount_with_transactions <- amount_with_transactions %>%
mutate(Amount_Per_Transaction = Amount * 1000 / Number)
There is also a new column called Amount_Per_Transaction, which represents the average amount spent per transaction.
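As a worked example of this column, the first cleaned row of each dataset (week of 2014-03-07) can be plugged into the same formula. The factor of 1000 suggests that Amount is recorded in thousands of TRY; that unit is an assumption here, since the text does not state it.

```r
amount <- 8004400    # first Amount value after cleaning (assumed: thousand TRY)
number <- 65962973   # first Number value after cleaning

# Same formula as Amount_Per_Transaction
per_transaction <- amount * 1000 / number
round(per_transaction, 2)
```

Under that assumption the result reads as roughly 121 TRY per transaction for that week.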
This data frame should be converted to an xts object.
amount_transaction_xts <- xts(select(amount_with_transactions, -c(Date)),
order.by = amount_with_transactions$Date)
Now the object is ready to be plotted.
Looking at Figure 1, one can conclude that both of these variables have an upward trend throughout this period. The correlation between these two variables is:
cor(amount_transaction_xts$Amount, amount_transaction_xts$Number)
## Number
## Amount 0.9433496
This value shows that there is a strong positive correlation between the two.
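By default cor computes the Pearson coefficient: the sum of the products of the deviations from the means, divided by the square root of the product of the summed squared deviations. The sketch below checks that by hand on the first six weekly rows shown earlier. Since it uses only six rows, its value is not expected to match the 0.943 obtained from the full series.

```r
# First six weekly observations of each series
amount <- c(8004400, 8650779, 8501345, 8559196, 8176410, 8771625)
number <- c(65962973, 69238515, 71510062, 71276145, 66492495, 70694670)

# Deviations from the means
da <- amount - mean(amount)
dn <- number - mean(number)

# Pearson correlation written out
r_manual <- sum(da * dn) / sqrt(sum(da^2) * sum(dn^2))

# Built-in equivalent
r_builtin <- cor(amount, number)

print(c(manual = r_manual, builtin = r_builtin))
```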
One last point of interest in this part is whether the average amount spent per transaction is increasing. The plot for this variable:
Figure 2 suggests an upward trend over time.
Firstly, xts objects corresponding to total expenditure amount and USD-TRY exchange rate datasets should be created.
expenditure_xts <- xts(select(total_expenditure, Amount), order.by = total_expenditure$Date)
usdtry_xts <- xts(select(usd_try, USDTRY), order.by = usd_try$Date)
To have a look at these objects,
head(expenditure_xts)
## Amount
## 2014-03-07 8004400
## 2014-03-14 8650779
## 2014-03-21 8501345
## 2014-03-28 8559196
## 2014-04-04 8176410
## 2014-04-11 8771625
head(usdtry_xts)
## USDTRY
## 2014-01-02 2.1304
## 2014-01-03 2.1718
## 2014-01-04 2.1718
## 2014-01-05 2.1718
## 2014-01-06 2.1687
## 2014-01-07 2.1878
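As one can see, expenditure_xts is weekly data whereas usdtry_xts is daily. To be able to merge these objects, they have to have the same indices. To achieve that, the daily data can be averaged for each week. The expenditure dates are Fridays, so they are first shifted by two days to the Sunday that closes each week of the daily series: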
index(expenditure_xts) <- index(expenditure_xts) + 2
usdtry_xts <- usdtry_xts['2014-03-07/2020-04-10']
ep <- endpoints(usdtry_xts, on='weeks')
usdtry_weekly_xts <- period.apply(usdtry_xts, INDEX = ep, FUN = mean)
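The weekly averaging can be followed on a handful of rows. The sketch below reuses the six daily rates for 2014-03-07 to 2014-03-12 that appear in the output of this section, finds the week boundaries with endpoints, and averages inside each week. Because the sample stops on a Wednesday, the second "week" is partial and is labelled with its last available day.

```r
library(xts)

# Daily rates for 2014-03-07 (Friday) to 2014-03-12 (Wednesday)
days  <- seq(as.Date("2014-03-07"), by = "day", length.out = 6)
daily <- xts(matrix(c(2.1999, 2.1999, 2.1999, 2.1873, 2.2118, 2.2220),
                    dimnames = list(NULL, "USDTRY")),
             order.by = days)

# Row positions that close each week; xts weeks run Monday to Sunday
ep <- endpoints(daily, on = "weeks")
print(ep)

# Average the daily rates inside each week
weekly <- period.apply(daily, INDEX = ep, FUN = mean)
print(weekly)

# A Friday expenditure date shifted by two days lands on that week's Sunday
friday <- as.Date("2014-03-07")
shifted <- friday + 2
format(shifted, "%Y-%m-%d")
```

The first weekly row is stamped 2014-03-09, which is why the expenditure index is moved from Friday to Sunday before merging.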
One final look at the objects:
head(expenditure_xts)
## Amount
## 2014-03-09 8004400
## 2014-03-16 8650779
## 2014-03-23 8501345
## 2014-03-30 8559196
## 2014-04-06 8176410
## 2014-04-13 8771625
head(usdtry_xts)
## USDTRY
## 2014-03-07 2.1999
## 2014-03-08 2.1999
## 2014-03-09 2.1999
## 2014-03-10 2.1873
## 2014-03-11 2.2118
## 2014-03-12 2.2220
So, these objects are now ready to be merged.
amount_with_exchange <- merge(expenditure_xts, usdtry_weekly_xts, join = 'inner')
The final table looks like this:
head(amount_with_exchange)
## Amount USDTRY
## 2014-03-09 8004400 2.199900
## 2014-03-16 8650779 2.220029
## 2014-03-23 8501345 2.229900
## 2014-03-30 8559196 2.212800
## 2014-04-06 8176410 2.147471
## 2014-04-13 8771625 2.107186
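The join = 'inner' argument keeps only the dates present in both objects, which is why matching indices matter. A toy sketch with made-up values (the objects a and b are illustrative) shows the difference from the default outer join:

```r
library(xts)

# Series a has three Sundays, series b is missing the middle one
a <- xts(matrix(c(1, 2, 3), dimnames = list(NULL, "A")),
         order.by = as.Date(c("2014-03-09", "2014-03-16", "2014-03-23")))
b <- xts(matrix(c(10, 30), dimnames = list(NULL, "B")),
         order.by = as.Date(c("2014-03-09", "2014-03-23")))

# Inner join: only dates found in both series survive
inner <- merge(a, b, join = "inner")
print(inner)

# Default (outer) join: every date is kept and gaps become NA
outer <- merge(a, b)
print(outer)
```

With an inner join, any remaining misalignment between the two indices shows up as missing rows rather than NA values, so checking the row count after merging is a quick sanity test.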
Since these two variables differ so much in scale, plotting them in the same figure would not be a sound approach. However, the correlation can be examined.
cor(amount_with_exchange$Amount, amount_with_exchange$USDTRY)
## USDTRY
## Amount 0.9123946
Again, these two variables seem to be strongly positively correlated.
The aim of this part is to examine the relationship between the total credit/debit card expenditure amount and the consumer price index (CPI) of Turkey. Since CPI is monthly data, the expenditure amount should also be converted into monthly data points.
# creating monthly expenditure xts object (monthly mean of the weekly amounts)
expenditure_monthly <- apply.monthly(expenditure_xts, FUN = mean)
# converting the Date index to yearmon so that it matches the CPI index
index(expenditure_monthly) <- as.yearmon(index(expenditure_monthly))
# creating consumer price index xts object
cpi_xts <- xts(select(turkey_cpi, CPI), order.by = turkey_cpi$Date)
A first look at the data:
head(expenditure_monthly)
## Amount
## Mar 2014 8428930
## Apr 2014 8555397
## May 2014 8900711
## Jun 2014 9432631
## Jul 2014 9415898
## Aug 2014 9044885
head(cpi_xts)
## CPI
## Jan 2014 233.54
## Feb 2014 234.54
## Mar 2014 237.18
## Apr 2014 240.37
## May 2014 241.32
## Jun 2014 242.07
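The monthly aggregation can be checked by hand on the first six weekly values shown earlier (with their Sunday-shifted dates). This sketch averages them per calendar month and relabels the index as yearmon:

```r
library(xts)

# First six weekly amounts with their Sunday-shifted dates
weekly <- xts(
  matrix(c(8004400, 8650779, 8501345, 8559196, 8176410, 8771625),
         dimnames = list(NULL, "Amount")),
  order.by = as.Date(c("2014-03-09", "2014-03-16", "2014-03-23",
                       "2014-03-30", "2014-04-06", "2014-04-13"))
)

# Mean of the weekly amounts within each calendar month
monthly <- apply.monthly(weekly, FUN = mean)

# Relabel the index as year-month so it can line up with monthly CPI
index(monthly) <- as.yearmon(index(monthly))
print(monthly)
```

The March value reproduces the 8428930 shown above, since March 2014 has exactly these four weeks; the April value differs because this toy series contains only two April weeks.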
Since these objects have different start and end points, both have to be trimmed.
expenditure_monthly <- expenditure_monthly['2014-03-01/2020-03-31']
cpi_xts <- cpi_xts['2014-03-01/2020-03-31']
Now these xts objects are ready to be merged.
amount_with_cpi <- merge(expenditure_monthly, cpi_xts)
Final state of the table:
head(amount_with_cpi)
## Amount CPI
## Mar 2014 8428930 237.18
## Apr 2014 8555397 240.37
## May 2014 8900711 241.32
## Jun 2014 9432631 242.07
## Jul 2014 9415898 243.17
## Aug 2014 9044885 243.40
Again, since the values of these two variables differ greatly in scale, it would not make much sense to plot them in the same graph. However, the correlation can be calculated:
cor(amount_with_cpi$Amount, amount_with_cpi$CPI)
## CPI
## Amount 0.9741323
It seems like there is also a highly positive correlation between credit/debit card expenditure amount and consumer price index.
The total credit/debit card expenditure amount over the last 6 years is strongly positively correlated with the number of transactions made, as expected.
The average amount spent per credit/debit card transaction has had an upward trend over the last 6 years.
There is a positive relationship between the total credit/debit card expenditure amount and the USD-TRY exchange rate.
The consumer price index is also strongly positively correlated with the total credit/debit card expenditure amount over the last 6 years.