Analyzing and understanding the factors affecting loan status using data from Lending Club.
- This project aims to analyze loan data from Lending Club to understand the factors that influence the status of loans.
- Background: Lending Club is a peer-to-peer lending company that connects borrowers with investors. The dataset includes information on loans, borrower details, and payment status.
- Business Problem: The objective is to identify key variables that predict loan default, which can help in making informed lending decisions.
- Dataset: The dataset used in this study is the 'loan.csv' file from Lending Club, containing information such as loan amount, interest rate, annual income, and loan status.
- The dataset was loaded using pandas and inspected for missing values and data types.
- Columns with a high percentage of missing values (threshold set at 40%) were dropped.
- Unnecessary columns such as 'id' and 'member_id' were removed to streamline the analysis.
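The cleaning steps above can be sketched roughly as follows. This is a minimal illustration and not necessarily the notebook's exact code: the helper names `drop_sparse_columns` and `clean_loans` are hypothetical, and treating "high percentage" as strictly more than 40% missing is an assumption.

```python
import pandas as pd


def drop_sparse_columns(df: pd.DataFrame, threshold: float = 0.40) -> pd.DataFrame:
    """Drop columns whose fraction of missing values exceeds `threshold`."""
    # isna().mean() gives the fraction of missing values per column.
    missing_fraction = df.isna().mean()
    keep = missing_fraction[missing_fraction <= threshold].index
    return df[keep]


def clean_loans(df: pd.DataFrame, id_columns=("id", "member_id")) -> pd.DataFrame:
    """Remove sparse columns, then identifier columns that carry no signal."""
    df = drop_sparse_columns(df)
    # errors="ignore" keeps this safe if an identifier column is already gone.
    return df.drop(columns=list(id_columns), errors="ignore")


# Usage (assumes loan.csv sits in the working directory):
# loans = pd.read_csv("loan.csv", low_memory=False)
# loans.info()                      # data types and non-null counts
# print(loans.isna().mean().sort_values(ascending=False).head())
# cleaned = clean_loans(loans)
```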
- Box plots were used to examine the relationship between numerical variables and loan status.
- Count plots were utilized to explore the relationship between categorical variables and loan status.
- The correlation matrix of numerical variables was computed and visualized using a heatmap to identify significant correlations.
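The three kinds of plots described above could be produced along these lines. This is a sketch under assumptions: the function names are hypothetical, and the column names (`loan_status`, `int_rate`, `grade`) are assumed names for the loan status, interest rate, and grade fields in `loan.csv`.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_numeric_vs_status(df, column, status_col="loan_status"):
    """Box plot of one numerical column, split by loan status."""
    fig, ax = plt.subplots(figsize=(6, 4))
    sns.boxplot(data=df, x=status_col, y=column, ax=ax)
    ax.set_title(f"{column} by {status_col}")
    return ax


def plot_categorical_vs_status(df, column, status_col="loan_status"):
    """Count plot of one categorical column, with one bar per loan status."""
    fig, ax = plt.subplots(figsize=(6, 4))
    sns.countplot(data=df, x=column, hue=status_col, ax=ax)
    ax.set_title(f"{column} counts by {status_col}")
    return ax


def plot_correlation_heatmap(df):
    """Correlation matrix of the numerical columns, drawn as a heatmap."""
    corr = df.select_dtypes(include="number").corr()
    fig, ax = plt.subplots(figsize=(8, 6))
    sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", ax=ax)
    ax.set_title("Correlation of numerical variables")
    return corr, ax


# Usage, given a cleaned DataFrame named `cleaned` (assumed):
# plot_numeric_vs_status(cleaned, "int_rate")
# plot_categorical_vs_status(cleaned, "grade")
# plot_correlation_heatmap(cleaned)
# plt.show()
```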
- Conclusion 1: Applicants with shorter loan terms are more likely to default.
- Conclusion 2: Applicants with an annual income of less than 120,000 are more likely to default.
- Conclusion 3: Applicants with a DTI ratio above 10% struggled most to repay their loans.
- Conclusion 4: Verification status does not appear to lower the default rate; verified applicants account for the largest share of defaults.
- Conclusion 5: Interest rate is positively correlated with loan default; applicants with higher interest rates are more likely to default.
- Conclusion 6: Loan grade is associated with default percentage; applicants with lower grades are more likely to default.
- Conclusion 7: Applicants with a Rented home are slightly more likely to default than applicants with Mortgaged or Own homes.
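Comparisons like the ones above can be checked by computing a default rate per group. The sketch below is illustrative only: `default_rate_by` is a hypothetical helper, and both the `loan_status` column name and the use of "Charged Off" as the label for a defaulted loan are assumptions about the dataset.

```python
import pandas as pd


def default_rate_by(df, column, status_col="loan_status", default_label="Charged Off"):
    """Percentage of defaulted loans within each value of `column`."""
    # Boolean Series: True where the loan is labelled as defaulted.
    is_default = df[status_col].eq(default_label)
    # The mean of a boolean Series is the fraction of True values per group.
    rates = is_default.groupby(df[column]).mean().mul(100)
    return rates.sort_values(ascending=False).rename("default_rate_pct")


# Usage, given a cleaned DataFrame named `cleaned` (assumed):
# print(default_rate_by(cleaned, "home_ownership"))
# print(default_rate_by(cleaned, "grade"))
```

Working with rates rather than raw counts keeps large groups from dominating the comparison simply because they contain more loans.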
- pandas - version 1.x
- matplotlib - version 3.x
- seaborn - version 0.x
- This project was inspired by the need to understand loan default risk for better lending decisions.
Created by [@ManisCodeBase] - feel free to contact me!