
Credit Risk Analysis

Date Updated: July 1, 2023

This project helps our financial institution make informed, data-driven decisions about our loan products and risk monitoring, based on insights from the dataset.

ABOUT THIS PROJECT

Business Objective

The purpose of this project is to help our financial institution make informed, data-driven decisions about our loan products and risk monitoring, using insights drawn from the dataset.

Machine Learning

To better understand the historical data, I used Python to experiment with several models, including Random Forest, K Neighbors, XGBoost, and Gaussian. The model that performed best, and was ultimately chosen, was Logistic Regression. By examining the coefficients in the fitted model's summary, we can pinpoint the key credit risk indicators: a larger positive coefficient signals higher risk, whereas a lower or negative value suggests lower risk. This insight helps us make informed decisions about the credit risk levels of new clients.
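As an illustration of this workflow, the sketch below compares a few candidate models by cross-validated AUC and then ranks the Logistic Regression coefficients. It is a minimal example under stated assumptions, not the project's actual code: the data is synthetic, the column names (`age`, `savings`, `loan_duration_months`) and the label definition are hypothetical, and XGBoost is left out to avoid an extra dependency. Features are standardized first so that coefficient sizes are comparable across features.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; the real dataset and its columns are not shown here.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "age": rng.integers(18, 70, n),
    "savings": rng.integers(0, 5000, n),
    "loan_duration_months": rng.integers(6, 60, n),
})
# Hypothetical label: 1 = high risk, more likely for young clients with low savings.
logit = 2.0 - 0.05 * X["age"] - 0.0005 * X["savings"] + 0.02 * X["loan_duration_months"]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Candidate models (a subset of those named in the text).
candidates = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "k_neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
}

# Mean 5-fold cross-validated ROC AUC per model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    for name, model in candidates.items()
}

# Fit the chosen model on all data and rank its coefficients.
final_model = candidates["logistic_regression"].fit(X, y)
coefs = pd.Series(
    final_model.named_steps["logisticregression"].coef_[0],
    index=X.columns,
).sort_values(ascending=False)

print(pd.Series(scores).round(3))
print(coefs.round(3))  # positive values raise the predicted log-odds of class 1 (high risk)
```

A coefficient only reads as "higher means riskier" when the positive class is the high-risk one, so it is worth checking how the label is encoded before interpreting the ranking.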

Data Analytics & Dashboards

I divided the analysis into three sections to keep the insights both deep and clear.

Product and Action-Based Recommendations

Focuses on providing specific, actionable insights related to various products and services, facilitating informed decision-making and strategic planning.

Data-Driven Recommendations

Concentrates on deriving insights directly from the dataset using advanced visualizations to reveal patterns, trends, and outliers, offering a data-centric perspective for formulating recommendations.

Comprehensive Dashboard and Predictive Analytics

Integrates insights from both product-action and data-driven analyses, presenting a unified dashboard view. This segment includes predictive analytics, synthesizing current data trends, and offering forward-looking insights to enhance strategic foresight in decision-making.

Result

The data analytics strategy encompasses targeted marketing for different age groups, financial education for younger clients, and tailored financial planning for mature clients. Incentives are proposed to enhance savings practices, and diversification of loan portfolios is recommended to mitigate risks associated with specific market sectors. Additionally, the use of radar graph analysis will aid in balancing loan distribution, aligning it with broader risk management objectives and market trends.

INSIGHTS

Credit Risk Trends Among Young Clients: Age and Savings as Key Indicators

The analysis indicates a distinct trend in credit risk profiles among clients. Clients who are students under the age of 25 exhibit a higher credit risk, which progressively diminishes with age. In addition, approximately 80% of clients hold less than $1,000 in savings, which places them in the lowest savings bracket. Both findings inform how we assess credit risk factors and tailor our risk management strategies.
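A breakdown like this can be reproduced with a few lines of Pandas. The sketch below is illustrative only: the eight rows are made-up sample values, and the column names (`age`, `savings`, `high_risk`) and age bands are assumptions rather than the project's actual schema.

```python
import pandas as pd

# Made-up sample rows; column names are hypothetical.
clients = pd.DataFrame({
    "age": [19, 22, 24, 31, 38, 45, 52, 60],
    "savings": [150, 400, 900, 300, 2500, 700, 5200, 800],
    "high_risk": [1, 1, 0, 1, 0, 0, 0, 0],
})

# Bucket clients into age bands; right=False makes each band [lower, upper).
age_bands = pd.cut(
    clients["age"],
    bins=[0, 25, 35, 50, 120],
    right=False,
    labels=["<25", "25-34", "35-49", "50+"],
)

# Share of high-risk clients within each age band.
risk_by_age = clients.groupby(age_bands, observed=True)["high_risk"].mean()

# Share of clients holding less than $1,000 in savings.
low_savings_share = (clients["savings"] < 1000).mean()

print(risk_by_age)
print(f"Share with savings below $1,000: {low_savings_share:.0%}")
```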


Clients categorized as low risk typically include those who have been customers for less than 12 months, have obtained loans for used cars, and own a house. This observation suggests a hypothesis: the duration of a customer's relationship with the bank may correlate with credit risk. Longer-term customers might have a more extensive credit history, which could reveal a higher likelihood of default or delayed payments. Conversely, newer customers, who may have opened accounts recently to address financial issues, could represent a lower credit risk. However, it is important to note that the available historical data is not comprehensive and lacks granular details, necessitating a cautious interpretation of these trends. This limitation underscores the need for further data acquisition and analysis to substantiate these initial findings.


Strategic Recommendations for Enhanced Client Engagement and Portfolio Diversification 

  1. Targeted Marketing Strategy:

    • Develop marketing campaigns tailored to specific age groups.

    • Focus on products that align with the unique financial circumstances and needs of each demographic.

  2. Educational Initiatives for Younger Clients:

    • Concentrate on educating younger age groups about credit-building, debt management, and early savings strategies.

    • Aim to cultivate long-term client loyalty from an early stage.

  3. Financial Planning for Mature Clients:

    • Shift focus towards long-term financial planning as clients age.

    • Provide detailed information on various investment options and retirement savings strategies.

    • Enhance clients' financial well-being and reinforce loyalty to our services.

  4. Incentives for Increased Savings:

    • Encourage clients, especially those with limited savings, to adopt more effective saving practices.

    • Offer incentives such as higher interest rates on savings accounts or rewards programs linked to saving milestones.

  5. Diversification of Loan Portfolios:

    • Address the current overemphasis on loans for small appliances and new cars.

    • Recommend diversifying risk across various purposes and industries to mitigate concentration risk.

    • Avert potential financial instability and economic vulnerability due to market downturns in specific sectors (e.g., small appliances, automotive).

  6. Data-Driven Insight:

    • Utilize radar graph analysis to identify and rectify imbalances in our loan distribution.

    • Ensure a balanced approach that aligns with our broader risk management strategy and market trends.
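The loan distribution imbalance mentioned in recommendations 5 and 6 can be checked numerically before it is plotted on a radar graph. The sketch below is a rough illustration: the loan records are invented, the `purpose` column name is an assumption, and comparing each share against an even split is just one simple way to flag overweight categories.

```python
import pandas as pd

# Invented loan records; the "purpose" column name is hypothetical.
loans = pd.DataFrame({
    "purpose": [
        "small appliances", "new car", "small appliances", "used car",
        "new car", "education", "small appliances", "new car",
        "furniture", "small appliances",
    ],
})

# Share of loans per purpose; these values would be the axes of a radar graph.
share = loans["purpose"].value_counts(normalize=True)

# Compare each share against a perfectly even split across purposes.
even_share = 1 / share.size
overweight = share[share > even_share]

print(share)
print("Purposes above an even split:", list(overweight.index))
```

In practice the target mix would come from the institution's risk policy rather than an even split, which is used here only as a neutral baseline.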

MACHINE LEARNING

Utilizing Python and Pandas for Predictive Credit Risk Modeling

In this project, I utilized Python and its powerful library, Pandas, to develop a predictive model for assessing credit risk. The process began with importing the training and test data sets using Pandas, followed by feature selection where variables like 'Loan Purpose', 'Gender', 'Marital Status', 'Housing', and 'Job' were identified as significant predictors. To handle categorical data, I employed the LabelEncoder from sklearn, ensuring that non-numerical features were appropriately transformed for analysis.
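The encoding step described above might look like the following. This is a sketch, not the project's code: small in-memory DataFrames stand in for the imported training and test files (whose names and contents are not shown here), and the category values and the `Risk` column are invented for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Stand-ins for the imported training and test sets; all values are invented.
train = pd.DataFrame({
    "Loan Purpose": ["new car", "used car", "small appliances"],
    "Gender": ["male", "female", "male"],
    "Marital Status": ["single", "married", "single"],
    "Housing": ["own", "rent", "own"],
    "Job": ["skilled", "unskilled", "skilled"],
    "Risk": [0, 1, 0],  # hypothetical target column
})
test = pd.DataFrame({
    "Loan Purpose": ["education", "new car"],
    "Gender": ["female", "male"],
    "Marital Status": ["married", "single"],
    "Housing": ["rent", "own"],
    "Job": ["unskilled", "skilled"],
})

categorical_cols = ["Loan Purpose", "Gender", "Marital Status", "Housing", "Job"]
encoders = {}

for col in categorical_cols:
    enc = LabelEncoder()
    # Fit on both sets so categories seen only in the test set do not raise an error.
    enc.fit(pd.concat([train[col], test[col]]))
    train[col] = enc.transform(train[col])
    test[col] = enc.transform(test[col])
    encoders[col] = enc  # keep the encoder to map codes back to labels later

print(train)
print(dict(zip(encoders["Loan Purpose"].classes_, range(len(encoders["Loan Purpose"].classes_)))))
```

One caveat: LabelEncoder assigns integer codes in sorted order, and a linear model such as Logistic Regression will treat those codes as ordered numbers. One-hot encoding is a common alternative when the categories have no natural order.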


Initially, I trained a Decision Tree Classifier. This choice helped me understand the feature importance and the decision-making process of the model. However, after further analysis, and because the problem is a binary classification task, I opted for the Logistic Regression algorithm. Logistic Regression handles binary outcomes well and provides a probability for each class, offering a more nuanced view of credit risk levels. It is also known for its interpretability, especially in showing the impact of each feature on the prediction, which is crucial in credit risk assessment, where explainability is as important as accuracy.
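The two stages described above can be sketched as follows. This is a minimal example on synthetic data, with three unnamed numeric features and a made-up label; it only shows which scikit-learn attributes expose feature importance and class probabilities, and it does not reproduce the project's results.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Synthetic data: feature 0 drives the label most, feature 2 is pure noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=300) > 0).astype(int)

# Stage 1: a shallow decision tree to inspect which features matter.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
importances = tree.feature_importances_  # non-negative, sums to 1

# Stage 2: logistic regression for per-class probabilities and signed coefficients.
logreg = LogisticRegression().fit(X, y)
proba = logreg.predict_proba(X[:5])  # one row per client, columns follow logreg.classes_

print("Tree feature importances:", importances.round(3))
print("Logistic coefficients:  ", logreg.coef_[0].round(3))
print("Class order:", logreg.classes_)
print(proba.round(3))
```

Tree importances show how much each feature is used but not the direction of its effect, whereas the logistic coefficients carry a sign, which is part of why the latter are easier to explain in a credit risk setting.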
