Authors: Sumanth Mysore, Abhinay Muthineni, Vaishnavi Nandikandi, Sudersan Behera
A country's economy is strongly influenced by its house prices, and both buyers and sellers depend on pricing strategies. Ask a buyer which factors they think determine the price of a house, and they will probably start with nearby railways and end with a range of other attributes. This shows that many factors enter into the pricing of a house. The aim of this project is to predict house prices using various regression models. Machine Learning (ML) is a rapidly growing technology, and data is at its heart. AI and ML hold a key position in the technology market, and industries are moving towards automation, so we chose ML as the main prediction tool in this project. Prices fluctuate constantly, from cryptocurrencies to business models, and real estate is no exception, so in this project house price prediction relies on real estate data and ML techniques. Many people want to buy a good house within their budget, but existing systems do not predict house prices well, and buyers end up losing money. The goal of our project is therefore to reduce this loss and help buyers find a good house. Many factors must be considered to predict a house price, including budget constraints and the house modifications a buyer requires. We take these factors into account and predict prices using machine learning techniques such as SVR, KNN, SGB regression, CatBoost regression, and Random Forest regression.
As Artificial Intelligence spreads into every field, technology plays an ever larger role in day-to-day life and the use of advanced machines has increased. As business innovation grows, computer science drives further technological transformation. This can reduce security vulnerabilities and improve the protection of data. Using various machine learning models and real estate data from Boston, we predict house prices across Boston. The project predicts house prices from Boston real estate datasets using different class labels. Because labelled data is needed to predict house prices, a supervised dataset is prepared; it plays a key role in the prediction and helps in dealing with real estate entities. Machine learning makes it easier to achieve more intelligent predictions, which benefits future projects and intelligent systems, including those linked to robotics.
The main methodology of machine learning is to build models from past data in order to predict new data. As the population grows rapidly, market demand grows at the same pace. Many people are leaving rural areas because jobs are scarce and unemployment is rising. This ultimately increases the number of houses in cities. Buyers who do not have a good idea of prices end up losing money.
In this project, we have used many machine learning algorithms to predict house prices, including Linear Regression, Random Forest Regressor, CatBoost Regressor, SVR, KNN, XGB Regressor, and AdaBoost Regressor. 80% of the known dataset is used for training and the remaining 20% for testing. The work involves several techniques, such as transformation techniques, reduction techniques, and searching for new correlations.
There is much to research in house price prediction, and knowledge of machine learning is required. In general, house prices are determined by many variables. These factors are referred to as concept, strength, and location. Physical conditions are also considered, including the number of rooms, the dimensions of the property, its age, and the size of the garage and kitchen.
In this project, we have used many machine learning algorithms, such as Linear Regression, Decision Tree, K-Means, and Random Forest. House prices are affected by various factors, including physical attributes, location, and economic factors. RMSE is taken as the performance metric: the algorithms are applied to various datasets, and the model with the lowest error, which gives the most accurate predictions, is selected.
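The 80/20 split described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' code: the function name `split_80_20`, the fixed seed, and the 100 placeholder records are assumptions made for the example. In practice a library helper such as scikit-learn's `train_test_split` is commonly used for the same purpose.

```python
import random

def split_80_20(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and split them into training and testing subsets."""
    rng = random.Random(seed)   # fixed seed (an assumption) for repeatability
    shuffled = list(rows)       # copy so the caller's data is left untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Placeholder records standing in for rows of the housing dataset.
records = list(range(100))
train_rows, test_rows = split_80_20(records)
print(len(train_rows), len(test_rows))  # 80 20
```

Shuffling before splitting avoids a test set that only contains the last rows of the file, which matters when the data is stored in some meaningful order.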
III. PROPOSED SYSTEM
In the proposed system, the data passes through three data pre-processing stages and is then fed to machine learning algorithms. The first step is cleaning the dataset: removing outliers and other invalid data points, checking for null values, checking whether the data is normally distributed, and checking the correlation between attributes. The three data pre-processing methods are Robust Scaler, Quantile Transformer, and Yeo-Johnson Transformer. The next steps involve models such as Linear Regression, KNN, SVR, Random Forest Regressor, AdaBoost Regressor, XGB Regressor, and CatBoost Regressor, on which we perform different operations to improve accuracy. The Root Mean Square Error (RMSE) is then calculated for each machine learning model explored and is used as the performance evaluation metric.
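As a minimal sketch of the evaluation metric, RMSE is the square root of the mean of the squared differences between actual and predicted values. The function name `rmse` and the sample prices below are hypothetical and are not taken from the dataset or from the authors' implementation.

```python
import math

def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical prices and model predictions, for illustration only.
actual_prices = [20.0, 30.0, 40.0, 50.0]
predicted_prices = [22.0, 28.0, 42.0, 48.0]
print(rmse(actual_prices, predicted_prices))  # 2.0
```

A lower RMSE means the predictions are closer to the actual prices, so when models are compared on the same test data the one with the smallest RMSE is preferred.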
A. Exploring the Data
Each tuple of the dataset describes a Boston suburb or town. The data was collected from the Boston SMSA (Standard Metropolitan Statistical Area) in the 1970s.
The attributes, taken from the UCI Machine Learning Repository (UCI MLR), are described below:
1. CRIM: per capita crime rate by town
We can clearly see that the attributes are measured in a mixture of different units.
B. Data Pre-Processing
a. Positive Correlation: If feature X decreases, feature Y also decreases, and if feature X increases, feature Y also increases. Both features move in the same direction and there is a linear relationship between them.
b. Negative Correlation: If feature X decreases, feature Y tends to increase, and vice versa.
c. No Correlation: There is no linear relationship between the two attributes.
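The three cases above can be illustrated with the Pearson correlation coefficient, which ranges from -1 to +1. The sketch below uses only the standard library; the function name `pearson` and the small sample lists are assumptions made for the example and do not come from the dataset.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]        # moves with x
falling = [10, 8, 6, 4, 2]       # moves against x
no_linear_link = [2, 0, 6, 0, 2] # no linear trend with x

print(pearson(x, rising))          # close to +1 (positive correlation)
print(pearson(x, falling))         # close to -1 (negative correlation)
print(pearson(x, no_linear_link))  # close to 0 (no correlation)
```

A coefficient near zero only rules out a linear relationship; two attributes can still be related in a non-linear way.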
C. Pre-processing Methods
D. Comparison of 3 transformation techniques
We compared the skewness of the data distribution after applying each of the three transformation techniques. The Yeo-Johnson transformer yields a skewness close to 0. We therefore conclude that the Yeo-Johnson transformer is the best of the three transformation techniques, and that data transformed with it leads to better predictions.
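The idea behind this comparison can be sketched as follows. This is a simplified illustration, not the authors' code: it covers robust scaling and the Yeo-Johnson transform only (the quantile transformer is omitted for brevity), the sample values are hypothetical, and the Yeo-Johnson parameter is fixed at lambda = 0 as an assumption, whereas a library implementation such as scikit-learn's `PowerTransformer` estimates lambda from the data.

```python
import math
import statistics

def skewness(values):
    """Skewness as the third standardized moment (0 for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def robust_scale(values):
    """Centre on the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]

def yeo_johnson(values, lam):
    """Yeo-Johnson power transform for a fixed lambda."""
    out = []
    for v in values:
        if v >= 0:
            out.append(math.log1p(v) if lam == 0 else ((v + 1) ** lam - 1) / lam)
        elif lam == 2:
            out.append(-math.log1p(-v))
        else:
            out.append(-(((1 - v) ** (2 - lam) - 1) / (2 - lam)))
    return out

# Hypothetical right-skewed attribute values, for illustration only.
values = [0.1, 0.2, 0.3, 0.5, 0.8, 1.3, 2.1, 3.4, 5.5, 8.9, 14.4, 23.3]

print("raw skew:        ", skewness(values))
print("robust-scaled:   ", skewness(robust_scale(values)))
print("Yeo-Johnson (0): ", skewness(yeo_johnson(values, 0)))
```

Robust scaling is a linear rescaling, so it leaves the skewness unchanged, while a power transform such as Yeo-Johnson changes the shape of the distribution and can pull the skewness towards 0.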
E. Exploring various ML models:
We observe that Support Vector Regression gives the highest accuracy, more than 89%, whereas the CatBoost algorithm yields an accuracy of more than 88%, almost the same as Support Vector Regression.
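The text does not state how accuracy is measured for these regression models. One common choice, assumed here purely for illustration, is the coefficient of determination R², often reported as a percentage. The function name `r2_score_simple` and the sample values are hypothetical.

```python
def r2_score_simple(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical prices and predictions, not results from the paper.
actual_prices = [20.0, 30.0, 40.0, 50.0]
predicted_prices = [22.0, 28.0, 42.0, 48.0]
print(r2_score_simple(actual_prices, predicted_prices))  # about 0.968
```

Under this assumption, a score of 0.89 would mean the model explains about 89% of the variance in the test prices.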
We convey our sincere thanks to all the faculty members of the ECM department, Sreenidhi Institute of Science and Technology, for their continuous help, co-operation, and support in completing this project.
We are very thankful to Dr. D. Mohan, Head of the ECM Department, Sreenidhi Institute of Science and Technology, Ghatkesar, for initiating this project, for providing valuable and timely suggestions, and for kind cooperation in the completion of the project.
We convey our sincere thanks to Dr. T. Ch. Siva Reddy, Principal, and Chakkalakal Tommy, Executive Director, Sreenidhi Institute of Science and Technology, Ghatkesar, for providing the resources to complete this project. Finally, we extend our gratitude to the Almighty, our parents, all our friends, and the teaching and non-teaching staff who directly or indirectly helped us in this endeavor.
Copyright © 2022 Sumanth Mysore, Abhinay Muthineni, Vaishnavi Nandikandi, Sudersan Behera. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.