Regression

Create linear models

Sample Regression Model

Gradient-boosted tree models such as LightGBM are hard to beat on tabular data. Pro tip: always do exploratory data analysis (EDA) first to determine whether you need any feature engineering.
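As a rough illustration of a first EDA pass, the sketch below summarizes missing values and each feature's correlation with the target. It assumes the data fits in a pandas DataFrame with numeric columns; the helper `quick_eda`, the column names `x1`, `x2`, `target`, and the synthetic data are all hypothetical stand-ins, not part of the original example.

```python
import numpy as np
import pandas as pd


def quick_eda(df: pd.DataFrame, target: str) -> dict:
    """Return a few first-pass summaries (hypothetical helper, not a library API)."""
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    return {
        "shape": df.shape,
        # Fraction of missing values per column, worst first
        "missing_fraction": df.isna().mean().sort_values(ascending=False),
        # Basic distribution statistics of the target
        "target_summary": df[target].describe(),
        # Pearson correlation of each numeric feature with the target,
        # ordered by absolute strength
        "target_correlation": corr.sort_values(key=abs, ascending=False),
    }


# Synthetic stand-in data: x1 drives the target, x2 is pure noise
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=200), "x2": rng.normal(size=200)})
df["target"] = 3.0 * df["x1"] + rng.normal(scale=0.1, size=200)
df.loc[:9, "x2"] = np.nan  # inject missing values into the first 10 rows

report = quick_eda(df, "target")
print(report["shape"])
print(report["missing_fraction"])
print(report["target_correlation"])
```

Columns with many missing values or near-zero correlation with the target are natural first candidates for imputation, transformation, or removal, though correlation only captures linear relationships.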

import lightgbm as lgb
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Load data
X, y = ... # Your data loading code here

# Split data into train and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Define hyperparameters
params = {
    'objective': 'regression',
    'metric': 'mse',
    'num_leaves': 31,
    'learning_rate': 0.05,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    'verbose': -1
}

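# Create the LightGBM datasets; the validation set references the training set
# so that both share the same feature binning
train_data = lgb.Dataset(X_train, label=y_train)
val_data = lgb.Dataset(X_val, label=y_val, reference=train_data)

# Train the model. Early stopping and periodic logging are passed as callbacks
# (available in LightGBM 3.3+, required in 4.0+)
model = lgb.train(
    params,
    train_data,
    num_boost_round=10000,
    valid_sets=[train_data, val_data],
    callbacks=[
        lgb.early_stopping(stopping_rounds=1000),  # stop if validation MSE stalls for 1000 rounds
        lgb.log_evaluation(period=100),            # print metrics every 100 rounds
    ],
)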

# Make predictions on the validation set
y_pred = model.predict(X_val)

# Evaluate the model
mse = mean_squared_error(y_val, y_pred)
print(f"Validation MSE: {mse:.4f}")
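A raw MSE is hard to judge on its own. One common sanity check, sketched here with made-up toy numbers in place of real `y_val` and `y_pred`, is to convert it to RMSE (same units as the target) and compare it with a baseline that always predicts the mean. The helper `mse` and all values below are illustrative assumptions.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error of two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy values standing in for the real validation targets and predictions
y_val = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 8.0]

model_mse = mse(y_val, y_pred)
model_rmse = math.sqrt(model_mse)  # back in the units of the target

# Baseline: always predict the mean of the validation targets
mean_target = sum(y_val) / len(y_val)
baseline_mse = mse(y_val, [mean_target] * len(y_val))

# With this baseline, 1 - model_mse / baseline_mse is the R^2 score
r2 = 1.0 - model_mse / baseline_mse

print(f"Model MSE: {model_mse:.4f}, RMSE: {model_rmse:.4f}")
print(f"Baseline MSE: {baseline_mse:.4f}, R^2: {r2:.4f}")
```

If the model's MSE is not clearly below the baseline's, the features or the training setup probably need another look before any hyperparameter tuning.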

Reference material

  • https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-supervised-learning#linear-models