This is the second part of the rent data analysis project. In this section, I look at the trends in the data collected. First, I look for any outliers in the data and make the necessary adjustments.
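Looking at the following graphs, we can observe some general trends about the data:
Let’s take a closer look at the 5-bedroom apartments in order to drop or correct the outliers. The summary statistics below show that a 5-bedroom apartment at the 75th percentile costs Ksh 300,000 ($3,000). The maximum, however, is Ksh 280 million, and the mean (about Ksh 1 million) is roughly five times the median, so a handful of extreme listings are skewing the data.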
| Statistic | Price (Ksh) |
|-----------|------------:|
| mean | 1,013,779 |
| std | 14,257,850 |
| min | 6,500 |
| 25% | 100,000 |
| 50% | 200,000 |
| 75% | 300,000 |
| max | 280,000,000 |
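
For readers who want to reproduce a table like this one, here is a minimal sketch using only Python's standard library. The function name `summarize` and the sample prices are hypothetical and are not the project's real data or code; the quartiles use linear interpolation, which is also the default in NumPy and pandas.

```python
import statistics

def summarize(prices):
    """Return mean, std, min, quartiles and max for a list of prices."""
    # method="inclusive" interpolates linearly between data points,
    # the same convention NumPy and pandas use by default.
    q1, q2, q3 = statistics.quantiles(prices, n=4, method="inclusive")
    return {
        "mean": statistics.mean(prices),
        "std": statistics.stdev(prices),  # sample standard deviation
        "min": min(prices),
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "max": max(prices),
    }

# Made-up 5-bedroom prices in Ksh, for illustration only.
sample_prices = [6_500, 100_000, 150_000, 200_000, 250_000, 300_000, 280_000_000]
stats = summarize(sample_prices)
```

Even in this toy sample, a single extreme listing pulls the mean far above the 75th percentile, which is the same pattern the real table shows.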
We can drop any apartment that costs more than $5,000 (about Ksh 500,000 at the exchange rate used above), which gives us a reasonable distribution, as shown below.
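
The cut-off described above could be applied roughly as follows. This is a sketch under assumptions: the field names `bedrooms` and `price`, the helper `drop_expensive`, and the sample rows are all hypothetical, and the exchange rate of Ksh 100 per dollar is the one implied by "Ksh 300,000 ($3000)". It also assumes the cut-off is meant for the 5-bedroom listings only.

```python
KSH_PER_USD = 100  # assumption: rate implied by "Ksh 300,000 ($3000)"
MAX_PRICE_KSH = 5_000 * KSH_PER_USD  # the $5000 cut-off expressed in Ksh

def drop_expensive(listings, bedrooms=5, max_price=MAX_PRICE_KSH):
    """Drop listings with the given bedroom count priced above max_price."""
    return [
        row for row in listings
        if not (row["bedrooms"] == bedrooms and row["price"] > max_price)
    ]

# Made-up rows, for illustration only.
listings = [
    {"bedrooms": 5, "price": 200_000},
    {"bedrooms": 5, "price": 280_000_000},  # implausible price, dropped
    {"bedrooms": 2, "price": 45_000},       # other sizes are left untouched
]
cleaned = drop_expensive(listings)
```

Dropping is the blunt option. Where the original listing can be checked, correcting the price keeps more data.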
The next step is to clean up the data for 1- and 8-bedroom apartments. From the graphs above we can see that:
This indicates an anomaly in the data. On closer inspection, some of the errors result from poor labeling or from incorrectly extracting the number of bedrooms from the listings’ titles or descriptions.
Some of the 8-bedroom apartments, for instance, were actually studio apartments that had the digit ‘8’ in their addresses. After replacing them with the correct values, we can see a clearer linear relationship between prices and bedrooms.
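
One way to avoid picking up stray digits such as a street number is to accept a number only when it is directly followed by a bedroom keyword. The sketch below illustrates the idea and is not the extraction code used in this project: the patterns, the helper `extract_bedrooms`, the example titles, and the choice to record a studio as 0 bedrooms are all assumptions.

```python
import re

# A number counts only if it is followed by "bedroom(s)", "bed(s)" or "br",
# e.g. "3 bedroom", "3-bedroom", "3BR". A bare "8" in an address is ignored.
BEDROOM_RE = re.compile(r"\b(\d+)\s*-?\s*(?:bedrooms?|beds?|br)\b", re.IGNORECASE)
STUDIO_RE = re.compile(r"\bstudio\b", re.IGNORECASE)

def extract_bedrooms(title):
    """Return the bedroom count, 0 for a studio, or None if it is unclear."""
    if STUDIO_RE.search(title):
        return 0  # assumption: studios are recorded as 0 bedrooms
    match = BEDROOM_RE.search(title)
    return int(match.group(1)) if match else None

# Made-up titles, for illustration only.
studio = extract_bedrooms("Studio apartment, 8 Riverside Drive")
three_bed = extract_bedrooms("Spacious 3-bedroom apartment with parking")
unknown = extract_bedrooms("House to let, 8 Riverside Drive")
```

Returning `None` for unclear titles, instead of guessing, leaves those rows easy to find and review by hand.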
We can now plot the distribution of listings to answer questions such as: which apartment sizes are the most common?
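
The counts behind such a plot can be computed with a few lines of standard-library Python. This is only a sketch: `bedroom_distribution` is a hypothetical helper and the sample values are made up, so they say nothing about the real listings.

```python
from collections import Counter

def bedroom_distribution(bedroom_counts):
    """Return (bedrooms, listings, share) tuples, most common size first."""
    counts = Counter(bedroom_counts)
    total = sum(counts.values())
    return [(size, n, n / total) for size, n in counts.most_common()]

# Made-up bedroom counts, one per listing, for illustration only.
sample_bedrooms = [1, 2, 2, 3, 2, 1, 4, 3, 2, 0]
distribution = bedroom_distribution(sample_bedrooms)
most_common_size = distribution[0][0]
```

The resulting list can be passed to any plotting library as the bar heights of the distribution chart.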
Here are some of the observations:
Let’s now take a look at how these prices compare across different towns.
As shown above:
Zooming in on Kilifi, Mombasa and Nairobi towns, we can see that Kilifi has the lowest rent prices followed by Mombasa.
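
A comparison like this one can be sketched by grouping listings by town and taking the median price of each group. The field names `town` and `price`, the helper `median_price_by_town`, and the sample prices are hypothetical placeholders chosen only to exercise the function; they are not results from the data set.

```python
from collections import defaultdict
from statistics import median

def median_price_by_town(listings, towns):
    """Return (town, median price) pairs for the chosen towns, cheapest first."""
    grouped = defaultdict(list)
    for row in listings:
        if row["town"] in towns:
            grouped[row["town"]].append(row["price"])
    pairs = [(town, median(prices)) for town, prices in grouped.items()]
    return sorted(pairs, key=lambda pair: pair[1])

# Placeholder rows, for illustration only.
listings = [
    {"town": "Nairobi", "price": 60_000},
    {"town": "Nairobi", "price": 90_000},
    {"town": "Mombasa", "price": 40_000},
    {"town": "Mombasa", "price": 50_000},
    {"town": "Kilifi", "price": 25_000},
    {"town": "Kilifi", "price": 35_000},
    {"town": "Nakuru", "price": 30_000},  # not in the selection, ignored
]
ranking = median_price_by_town(listings, {"Kilifi", "Mombasa", "Nairobi"})
```

The median is used here instead of the mean because, as the 5-bedroom statistics showed, a few extreme listings can distort the mean badly.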