Regression
Create linear models
Sample Regression Model
Gradient-boosted tree models such as LightGBM are hard to beat on tabular data. Pro tip: always do exploratory data analysis (EDA) first to determine whether you need any feature engineering.
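As a rough illustration of that first EDA pass, here is a minimal sketch using pandas. The toy DataFrame and its column names (`sqft`, `bedrooms`, `price`) are hypothetical stand-ins for your own data, and the three checks shown are only a starting point, not a complete EDA.

```python
import pandas as pd

# Hypothetical toy data; replace with your own dataset
df = pd.DataFrame({
    "sqft": [850, 1200, None, 1500, 2000],
    "bedrooms": [2, 3, 3, 4, 4],
    "price": [150.0, 210.0, 195.0, 260.0, 340.0],
})

# Basic distribution statistics for each numeric column
summary = df.describe()

# Share of missing values per column (candidates for imputation)
missing_share = df.isna().mean()

# Linear correlation of each feature with the target column
target_corr = df.corr()["price"].drop("price")

print(summary)
print(missing_share)
print(target_corr)
```

Columns with many missing values, skewed distributions, or weak relationships to the target are the usual candidates for feature engineering.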
import lightgbm as lgb
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
# Load data
X, y = ... # Your data loading code here
# Split data into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
# Define hyperparameters
params = {
    'objective': 'regression',
    'metric': 'mse',
    'num_leaves': 31,
    'learning_rate': 0.05,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'verbose': -1
}
# Create the lgbm datasets
train_data = lgb.Dataset(X_train, label=y_train)
val_data = lgb.Dataset(X_val, label=y_val)
# Train the model
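# Note: the early_stopping_rounds and verbose_eval arguments were removed from
# lgb.train() in LightGBM 4.0; the equivalent callbacks are used instead.
model = lgb.train(
    params,
    train_data,
    num_boost_round=10000,
    valid_sets=[train_data, val_data],
    callbacks=[
        lgb.early_stopping(stopping_rounds=1000),  # stop after 1000 rounds without validation improvement
        lgb.log_evaluation(period=100),  # log metrics every 100 rounds
    ],
)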
# Make predictions on the validation set
y_pred = model.predict(X_val)
# Evaluate the model
mse = mean_squared_error(y_val, y_pred)
print(f"Validation MSE: {mse:.4f}")
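To make two ideas from the script above more concrete, here is a small pure-Python sketch: the patience rule behind early stopping, and RMSE, which puts the error back in the units of the target. The function names and the list of losses are hypothetical and exist only for illustration. The code mirrors the general idea, not LightGBM's internal implementation.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the average squared residual."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def early_stopping_round(val_losses, patience):
    """Return (best_round, rounds_run) for a lower-is-better metric.

    Rounds are 0-based here. Training stops once `patience` rounds
    have passed without beating the best validation loss seen so far.
    """
    best_round, best_loss = 0, float("inf")
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_round, best_loss = i, loss
        elif i - best_round >= patience:
            return best_round, i + 1
    return best_round, len(val_losses)


# Hypothetical validation losses: improve for three rounds, then plateau
losses = [1.00, 0.80, 0.70, 0.72, 0.71, 0.73, 0.74]
best, ran = early_stopping_round(losses, patience=3)
print(f"best round: {best}, rounds run: {ran}")
print(f"RMSE: {rmse([0.0, 0.0], [3.0, 3.0]):.4f}")
```

A large patience such as 1000 rounds pairs naturally with a small learning rate, where the validation loss can stay flat for a long stretch before improving again.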
Reference Material
- https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-supervised-learning#linear-models