A random forest regressor can be implemented in a few lines of code with the RandomForestRegressor class from the sklearn.ensemble package. The class has many hyperparameters, each with a default value, such as n_estimators=100, criterion='mse', max_depth=None, and min_samples_split=2. We can choose better values for them with hyperparameter tuning techniques like GridSearchCV and RandomizedSearchCV.
In this article, we will demonstrate an end-to-end implementation of a random forest regressor in sklearn.
Random forest regressor sklearn : Implementation ( Stepwise ) –
First, we will import the package. Second, we will create the random forest regressor object. After that, we will fit the data to the object. Finally, we will make the prediction.
Step 1: Import the Package
from sklearn.ensemble import RandomForestRegressor
Step 2: Data Import –
Since we are doing regression, we need some data. Here we generate a sample dataset with sklearn.datasets for demonstration. You may use your own data in its place. Let's see the code.
from sklearn.datasets import make_regression
X, y = make_regression(n_features=4, n_informative=2, random_state=0, shuffle=False)
This creates X and y, the input features and the output variable. Both are NumPy arrays. The key objective of this step is to load your data into the same X, y format.
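If your data sits in a single table rather than coming from make_regression, one way to get the same X, y layout is to slice a NumPy array. The sketch below assumes a hypothetical array named data whose last column is the target; the values are made up purely for illustration.

```python
import numpy as np

# Hypothetical data: each row is one sample, the last column is the target.
data = np.array([
    [1.0, 2.0, 3.0, 4.0, 10.0],
    [2.0, 1.0, 0.0, 3.0, 7.5],
    [0.5, 4.0, 2.0, 1.0, 9.0],
])

X = data[:, :-1]  # every column except the last: features, shape (3, 4)
y = data[:, -1]   # last column: target, shape (3,)

print(X.shape, y.shape)
```

X ends up two-dimensional (samples by features) and y one-dimensional, which is the layout fit() expects.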
Step 3: Model Creation –
In this step, we will create the model from the RandomForestRegressor class. We first create the object and then fit it to the data. Here is the code for that.
regr_obj = RandomForestRegressor(max_depth=3, random_state=0)
regr_obj.fit(X, y)
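Here we have used the parameters max_depth and random_state. The other parameters can be configured according to your requirements and the available data.
Here is the complete list of hyperparameters of RandomForestRegressor. This signature comes from an older scikit-learn release; newer releases rename criterion='mse' to 'squared_error' and drop min_impurity_split, so check the documentation for your installed version.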
class sklearn.ensemble.RandomForestRegressor(n_estimators=100, *, criterion='mse', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, ccp_alpha=0.0, max_samples=None)
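Rather than setting these hyperparameters by hand, they can be searched with GridSearchCV. The following is a minimal sketch, not a recommended configuration: the param_grid values and the choice of cv=3 are assumptions made for illustration, and the data is the same synthetic set used above.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_features=4, n_informative=2, random_state=0, shuffle=False)

# Hypothetical search grid; the candidate values are illustrative only.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, None],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,  # 3-fold cross-validation for each parameter combination
)
search.fit(X, y)

# Best combination found within the grid above.
print(search.best_params_)
```

RandomizedSearchCV from the same module follows the same fit-then-inspect pattern but samples a fixed number of combinations instead of trying them all.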
To predict values, we can call regr_obj.predict() with an input array.
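As a small sketch of the prediction call: predict() takes a two-dimensional input with one row per sample and one column per feature, and returns one value per row. The name new_samples and the first input row are assumptions chosen for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_features=4, n_informative=2, random_state=0, shuffle=False)
regr_obj = RandomForestRegressor(max_depth=3, random_state=0)
regr_obj.fit(X, y)

# Two hypothetical samples, each with the same 4 features the model was trained on.
new_samples = [
    [0, 0, 0, 0],
    [2, 10, 30, 0],
]
predictions = regr_obj.predict(new_samples)

print(predictions.shape)  # one prediction per input row
```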
Here is the complete code. Copy it into your IDE or editor and run it; it prints the predicted value for the sample input.
from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import make_regression
X, y = make_regression(n_features=4, n_informative=2, random_state=0, shuffle=False)
regr_obj = RandomForestRegressor(max_depth=3, random_state=0)
regr_obj.fit(X, y)
print(regr_obj.predict([[2, 10, 30, 0]]))
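The complete code fits and predicts on the same synthetic data, so it says little about how the model behaves on unseen samples. One possible extension, which is not part of the walkthrough above, is to hold out part of the data with train_test_split and check the R^2 score on it. The 25% test size is an assumption for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_features=4, n_informative=2, random_state=0, shuffle=False)

# Hold out 25% of the samples for evaluation (illustrative split size).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

regr_obj = RandomForestRegressor(max_depth=3, random_state=0)
regr_obj.fit(X_train, y_train)

# score() returns the coefficient of determination (R^2) on the given data.
r2 = regr_obj.score(X_test, y_test)
print(r2)
```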
In conclusion, I hope you can now easily create a regression model using a random forest. If you have any questions, please leave them in the comment box below.
Data Science Learner Team