Commonwealth Bank - Cyber Fraud Detection with Splunk

As a cybersecurity generalist at Commonwealth Bank, you play a key role in defending against the growing threat of financial fraud.

Financial fraud poses a significant challenge for any financial institution, and it is important for Commonwealth Bank to stay up to date with the latest fraud detection technologies and strategies to minimize risk. Protecting against and responding to fraud is a major responsibility for you and your team. By detecting and stopping fraud, the bank can protect its customers, employees and reputation while also enhancing the resilience of its financial system.

During this cybersecurity job simulation, you’ll gain hands-on experience analyzing data, detecting fraud, and building dashboards to support decision-making. I strongly recommend trying this lab yourself.

Scenario

Your manager has asked you and the data analytics team to create a model that will predict or detect fraud using customer data from Commonwealth Bank. This model will help identify fraud more accurately and efficiently, while also protecting customers, employees, and the bank’s reputation.

You’ll use Splunk to build a dashboard that makes it easy to identify patterns and trends in a dataset. The dashboard should provide the key reports and metrics needed to identify and detect financial fraud. By using this dashboard, the team will be able to quickly spot any suspicious activity and take action to prevent fraud from occurring.

Your Splunk dashboard must include the following charts/tables:

  • Total number of Transactions.
  • Fraudulent vs. Non-Fraudulent Transactions.
  • Number of Transactions by Transaction Category.
  • Number of Transactions by Fraudulent Payments.
  • Number of Transactions by Age.
  • Number of Transactions by Merchant.
  • Fraud Cases by Transaction Category.
  • Fraud Cases by Age.
  • Fraud Cases by Month.
  • Fraud Cases by Gender.
  • Fraud Cases by Merchant.
  • Gender with the most fraudulent activity by Transaction Category.
  • Age group with the most fraudulent activity by Merchant.

About the Dataset

The dataset you’re going to work with has been carefully curated by the Fraud Team. It contains payment records from multiple customers, made at different times and for various amounts. Each record includes the following features:

[Image: table describing the features in each record]

Here’s a small sample of the dataset:

[Image: sample rows from the dataset]

The assignment asks you to analyze fraud patterns. Each visualization helps reveal who is affected by fraud, and where and when it happens.

Transaction Metrics

This section establishes an analytical baseline, summarizing total transaction volumes and their distribution across transaction categories, age groups, and merchants. These metrics provide the context needed to identify behavioral patterns, detect anomalies, and compare legitimate activity with fraudulent transactions in the later analyses.

Total Number of Transactions

This visualization shows you the total number of transactions.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv" 
| stats count as "Total Transactions"

Fraudulent vs. Non-Fraudulent Transactions

This visualization compares fraudulent and non-fraudulent transactions, making it easy to see the overall proportion of fraud. It is often one of the first panels in a fraud detection dashboard because it gives a quick view of fraud levels.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv"
| eval status = if(fraud=1, "Fraudulent", "Non-Fraudulent")
| stats count AS Transactions by status
| rename status AS "Transaction Type", Transactions AS "Count"
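
The proportion this panel is meant to convey can also be computed directly. The following is a minimal sketch, not part of the original dashboard: it adds each status’s share of all transactions using eventstats. The field names Total and Share are illustrative choices, and the sketch assumes the same index, sourcetype, and fraud field as the query above.

```
index="main" sourcetype="fraud_detection.csv"
| eval status = if(fraud=1, "Fraudulent", "Non-Fraudulent")
| stats count AS Count by status
| eventstats sum(Count) AS Total
| eval Share = round(Count / Total * 100, 2)
| fields - Total
| rename status AS "Transaction Type", Share AS "Share (%)"
```

eventstats attaches the grand total to every row without collapsing the rows, which is what makes the per-row division possible.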

Number of Transactions by Transaction Category

This chart shows you the total number of transactions for each transaction category. It provides a clear view of which types of transactions are most common.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv" 
| stats count as "Number of Transactions" by category

Number of Transactions by Fraudulent Payments

The following chart shows how many transactions are fraudulent vs. non-fraudulent, grouped by the raw fraud field, where 1 marks a fraudulent transaction.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv" 
| stats count as "Number of Transactions" by fraud

Number of Transactions by Age

The following visualization shows the number of transactions across the different age groups. It reveals which age groups are more active than others, which helps you understand customer behavior by age.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv"
| eval AgeOrder = case(
    age==0, 0,
    age==1, 1,
    age==2, 2,
    age==3, 3,
    age==4, 4,
    age==5, 5
  ),
  AgeRange = case(
    age==0, "<=18",
    age==1, "19-25",
    age==2, "26-35",
    age==3, "36-45",
    age==4, "46-55",
    age==5, "56-65"
  )
| stats count as "Number of Transactions" by AgeRange AgeOrder
| sort AgeOrder
| fields - AgeOrder
| rename AgeRange as age
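
One thing to be aware of: case() returns null when no condition matches, and stats leaves out events whose by field is null. If the dataset contains age codes outside 0-5 (the post does not say whether it does), those transactions would silently drop out of this chart. A hedged variant adds a catch-all bucket; the "Other/Unknown" label and the sort value 99 are assumptions made for illustration only:

```
index="main" sourcetype="fraud_detection.csv"
| eval AgeRange = case(
    age==0, "<=18",
    age==1, "19-25",
    age==2, "26-35",
    age==3, "36-45",
    age==4, "46-55",
    age==5, "56-65",
    true(), "Other/Unknown"
  )
| eval AgeOrder = if(AgeRange=="Other/Unknown", 99, age)
| stats count as "Number of Transactions" by AgeRange AgeOrder
| sort AgeOrder
| fields - AgeOrder
| rename AgeRange as age
```

Comparing the sum of this chart against the Total Transactions panel is a quick way to check whether any rows fall into the catch-all bucket.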

Number of Transactions by Merchant

This visualization shows the number of transactions for each merchant.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv" 
| stats count as "Number of Transactions" by merchant

Fraud Detection Metrics

This section focuses on how fraudulent activity is distributed across transaction categories, age groups, months, genders, and merchants. Each visualization helps reveal who is affected by fraud and where it occurs, so the bank can identify risk patterns, correlations, and areas that may require stronger detection or preventative measures.

Fraud Cases by Transaction Category

This chart shows you the total number of transactions and the number of fraudulent transactions in each category. It helps identify which transaction categories have higher instances of fraud.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv" 
| stats count as "Total Transactions", 
        count(eval(fraud=1)) as "Fraudulent Transactions" by category
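
Raw counts favor high-volume categories, so it can help to look at the share of fraud within each category as well. A possible extension, assuming the same fields as the query above (the fraud_rate name and the two-decimal rounding are illustrative choices, not part of the original dashboard):

```
index="main" sourcetype="fraud_detection.csv"
| stats count as total, count(eval(fraud=1)) as fraud_count by category
| eval fraud_rate = round(fraud_count / total * 100, 2)
| sort - fraud_rate
| rename total as "Total Transactions", fraud_count as "Fraudulent Transactions", fraud_rate as "Fraud Rate (%)"
```

Sorting on the rate puts small but heavily targeted categories at the top, where a count-based chart would bury them.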

Fraud Cases by Age

This chart shows you the total number of transactions and the number of fraudulent transactions for each age range. It helps identify which age groups have higher fraudulent activity.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv"
| eval AgeOrder = case(
    age==0, 0,
    age==1, 1,
    age==2, 2,
    age==3, 3,
    age==4, 4,
    age==5, 5
  ),
  AgeRange = case(
    age==0, "<=18",
    age==1, "19-25",
    age==2, "26-35",
    age==3, "36-45",
    age==4, "46-55",
    age==5, "56-65"
  )
| stats count as "Total Transactions", count(eval(fraud=1)) AS "Fraudulent Transactions" by AgeRange AgeOrder
| sort AgeOrder
| fields - AgeOrder
| rename AgeRange as age

Fraud Cases by Month

The following chart shows you the number of fraudulent transactions per month. It helps the bank to see which months experience higher numbers of fraudulent transactions.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv" 
| eval month = case(
    step==0, "May",
    step==1, "June",
    step==2, "July",
    step==3, "August"
  )
| stats count(eval(fraud=1)) as "Fraudulent Transactions" by month
| eval month_order = case(
    month=="May",1,
    month=="June",2,
    month=="July",3,
    month=="August",4 
 )
| sort month_order
| fields - month_order
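
A shorter variant is sketched below, under the assumption that each step value corresponds to exactly one month, as the mapping above implies. Aggregating and sorting on the numeric step first means the month labels only need to be assigned once, and no separate ordering field is required:

```
index="main" sourcetype="fraud_detection.csv"
| stats count(eval(fraud=1)) as "Fraudulent Transactions" by step
| sort step
| eval month = case(
    step==0, "May",
    step==1, "June",
    step==2, "July",
    step==3, "August"
  )
| where isnotnull(month)
| fields month "Fraudulent Transactions"
```

The where clause drops any step value outside 0-3, which matches the behavior of the original query.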

Fraud Cases by Gender

This visualization shows you both the total number of transactions and the number of fraudulent transactions for each gender. It helps the bank understand which genders are associated with higher levels of fraudulent activity.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv"
| stats count as "Total Transactions", count(eval(fraud=1)) as "Fraudulent Transactions" by gender

Fraud Cases by Merchant

This chart shows you the total number of transactions and the number of fraudulent transactions for each merchant. This helps the bank identify which merchants experience significant fraud activity.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv" 
| stats count as "Total Transactions", count(eval(fraud=1)) AS "Fraudulent Transactions" by merchant

Gender with the most fraudulent activity by Transaction Category

The purpose of this visualization is to identify which gender has the most fraudulent activity within each transaction category.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv" 
| chart count(eval(fraud=1)) as "Fraud Cases" by category, gender
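
If a single answer per category is preferred over a chart, a sort followed by dedup can reduce the result to the top gender for each category. This is a sketch rather than the query behind the panel; fraud_count and the "Top Gender" label are illustrative names, and a tie is resolved by whichever row the sort happens to place first:

```
index="main" sourcetype="fraud_detection.csv"
| stats count(eval(fraud=1)) as fraud_count by category gender
| sort 0 category, -fraud_count
| dedup category
| rename gender as "Top Gender", fraud_count as "Fraud Cases"
```

Because dedup keeps only the first row it sees for each category, the descending sort on fraud_count is what ensures that row belongs to the gender with the most fraud cases.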

Age group with the most fraudulent activity by Merchant

The purpose of this visualization is to identify the age group with the most fraudulent activity for each merchant.

Here’s the SPL query used:

index="main" sourcetype="fraud_detection.csv"
| eval AgeOrder = case(
    age==0, 0,
    age==1, 1,
    age==2, 2,
    age==3, 3,
    age==4, 4,
    age==5, 5
  ),
  AgeRange = case(
    age==0, "<=18",
    age==1, "19-25",
    age==2, "26-35",
    age==3, "36-45",
    age==4, "46-55",
    age==5, "56-65"
  )
| stats count(eval(fraud=1)) as "Fraudulent Transactions" by merchant AgeRange AgeOrder
| sort 0 merchant - "Fraudulent Transactions"
| dedup merchant
| sort AgeOrder
| eval chartLabel = merchant . " (" . AgeRange . ")"
| fields chartLabel "Fraudulent Transactions"

Final Dashboard

Armed with this dashboard, the bank can now spot unusual trends, recognize fraud patterns, address weak spots, and improve its overall security posture.

This post is licensed under CC BY 4.0 by the author.