Title: | A Linear Model to 'SQL' Compiler |
---|---|
Description: | A cross-platform compiler that generates 'SQL' from linear and generalized linear models. Its interface is a single function, modelc(), which takes the output of lm() or glm() (or any object with the same signature) and returns a 'SQL' character vector representing predictions on the scale of the response variable, as described in Dunn & Smith (2018) <doi:10.1007/978-1-4419-0118-7> and originating in Nelder & Wedderburn (1972) <doi:10.2307/2344614>. The resulting 'SQL' can be included in a 'SELECT' statement and returns output similar to that of predict.glm() or predict.lm(), assuming the database represents numeric types with sufficient precision. Currently the log and identity link functions are supported. |
Authors: | Hugo Saavedra [aut, cre] |
Maintainer: | Hugo Saavedra <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0.0 |
Built: | 2025-02-12 04:10:27 UTC |
Source: | https://github.com/sparkfish/modelc |
Wrap the model SQL in the appropriate link function inverse to return scaled predictions
apply_linkinverse(model, sql)
model |
A list with the same signature as the output of lm() or glm() |
sql |
A character string representing the SQL to be wrapped in the link inverse |
A character string representing a SQL model formula
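As an illustration of the wrapping step described above, here is a minimal sketch for the two supported links (log and identity). The function name `wrap_link_inverse_sketch` and the use of `EXP()` as the SQL exponential are assumptions made for this example; this is not the package's own implementation.

```r
# Hypothetical helper, not part of modelc: wrap a SQL expression in the
# inverse of the model's link function.
wrap_link_inverse_sketch <- function(model, sql) {
  # family() works for glm objects and for lm objects (gaussian, identity link)
  link <- stats::family(model)$link
  switch(link,
    identity = sql,                      # inverse of identity is a no-op
    log = paste0("EXP(", sql, ")"),      # inverse of log is the exponential
    stop("Unsupported link function: ", link)
  )
}

df <- data.frame(
  x = 1:10,
  y = c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2)
)
identity_fit <- lm(y ~ x, data = df)
log_fit <- glm(y ~ x, data = df, family = Gamma(link = "log"))

# The SQL fragment is an illustrative placeholder, not real modelc output.
wrap_link_inverse_sketch(identity_fit, "1.5 + 2 * [x]")
wrap_link_inverse_sketch(log_fit, "1.5 + 2 * [x]")
```

With the identity link the SQL is returned unchanged; with the log link it is wrapped as `EXP(...)`, which puts the prediction on the response scale.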
Get SQL representing a continuous term in the model with no interactions
build_additive_term(model, additive_term, first = FALSE)
model |
A list with the same signature as the output of lm() or glm() |
additive_term |
A parameter name. |
first |
A logical flag signaling whether the term is the first term in the formula |
A SQL character string representing an additive term
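A rough sketch of how an additive term might be rendered follows. It assumes that the `first` flag only controls whether a leading ` + ` is emitted; the helper name `additive_term_sketch` and its signature (it takes the coefficient directly rather than a model) are hypothetical.

```r
# Hypothetical helper, not part of modelc: render "coefficient * column",
# prefixed with " + " unless it is the first term of the formula.
additive_term_sketch <- function(column, coefficient, first = FALSE) {
  # Fixed notation with up to 15 significant digits
  value <- format(coefficient, digits = 15, scientific = FALSE)
  term <- paste0(value, " * ", column)
  if (first) term else paste0(" + ", term)
}

# Illustrative coefficients, not taken from a fitted model
sql <- paste0(
  additive_term_sketch("wt", -3.5, first = TRUE),
  additive_term_sketch("hp", 0.25)
)
sql
```

Concatenating the pieces yields `-3.5 * wt + 0.25 * hp`.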
Build SQL CASE statements representing the factors in the model
build_factor_case_statements(model, first = FALSE)
model |
A list with the same signature as the output of lm() or glm() |
first |
A logical flag signaling whether the term is the first term in the formula |
A character string representing a SQL CASE statement
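To make the idea of a factor CASE statement concrete, the sketch below maps each non-baseline level of one factor to its coefficient. It assumes treatment contrasts, so the baseline level contributes 0 through the `ELSE` branch. The helper name, its arguments, and the exact SQL layout are assumptions for illustration and may differ from what the package emits.

```r
# Hypothetical helper, not part of modelc: build one CASE expression that
# maps factor levels to their fitted coefficients.
factor_case_sketch <- function(column, levels, coefficients) {
  stopifnot(length(levels) == length(coefficients))
  # Quote each level as a SQL string literal, doubling embedded single quotes
  quoted <- paste0("'", gsub("'", "''", levels, fixed = TRUE), "'")
  # Format each coefficient on its own so values are not padded to a common width
  values <- vapply(coefficients, format, character(1),
                   digits = 15, scientific = FALSE)
  whens <- paste0("WHEN ", quoted, " THEN ", values)
  paste0("CASE ", column, " ", paste(whens, collapse = " "), " ELSE 0 END")
}

# Illustrative levels and coefficients, not taken from a fitted model
factor_case_sketch("colour", c("green", "red"), c(0.25, -1.5))
```

The call returns `CASE colour WHEN 'green' THEN 0.25 WHEN 'red' THEN -1.5 ELSE 0 END`.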
Build a SQL interaction term
build_interaction_term(model, interaction_term, first = FALSE)
model |
A list with the same signature as the output of lm() or glm() |
interaction_term |
The raw interaction term (a character string) from the R model |
first |
A logical flag signaling whether the term is the first term in the formula |
A character string representing a SQL interaction term
Get SQL representing the intercept term given the R model and parameter name
build_intercept(model, parameter, first = FALSE)
model |
A list with the same signature as the output of lm() or glm() |
parameter |
A parameter name. |
first |
A logical flag signaling whether the term is the first term in the formula |
A SQL character string representing the intercept term in the model
Build a SQL product
build_product(lhs, rhs)
lhs |
A character string representing the left hand side of the multiplication |
rhs |
A character string representing the right hand side of the multiplication |
A character string representing a valid SQL product term
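The product helper is the natural building block for interaction terms. The sketch below shows one way a product of two SQL fragments could be composed into an interaction between numeric columns. The names `product_sketch` and `interaction_sketch` are hypothetical, and interactions involving factors (which would need CASE expressions) are deliberately left out.

```r
# Hypothetical helper, not part of modelc: a parenthesised SQL product.
product_sketch <- function(lhs, rhs) {
  paste0("(", lhs, " * ", rhs, ")")
}

# Hypothetical helper: turn an R interaction term such as "wt:hp" and its
# coefficient into nested products. Assumes every component is numeric.
interaction_sketch <- function(term, coefficient) {
  columns <- strsplit(term, ":", fixed = TRUE)[[1]]
  start <- format(coefficient, digits = 15, scientific = FALSE)
  # Fold the columns into the running product, starting from the coefficient
  Reduce(product_sketch, columns, accumulate = FALSE, start)
}

product_sketch("wt", "hp")
interaction_sketch("wt:hp", 0.5)
```

The parentheses keep each product safe to embed inside a larger expression; the second call returns `((0.5 * wt) * hp)`.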
Extract the level from the factor name
extract_level(parameter, factor)
parameter |
A parameter name |
factor |
A factor term |
A SQL string literal representing the factor level
Extract the coefficient of a model parameter
extract_parameter_coefficient(model, parameter)
model |
A list with the same signature as the output of lm() or glm() |
parameter |
A character string corresponding to a model predictor |
A double corresponding to the coefficient, or 0 if the coefficient is missing
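The "0 if the coefficient is missing" behaviour matters because lm() reports `NA` for aliased (perfectly collinear) predictors. Below is a minimal sketch of that fallback, assuming "missing" covers both an `NA` coefficient and a name that is absent from the model. The helper name is hypothetical.

```r
# Hypothetical helper, not part of modelc: look up a coefficient by name and
# fall back to 0 when it is absent or NA.
coefficient_or_zero_sketch <- function(model, parameter) {
  coefs <- stats::coef(model)
  if (!parameter %in% names(coefs)) {
    return(0)
  }
  value <- coefs[[parameter]]
  if (is.na(value)) 0 else value
}

d <- data.frame(x1 = 1:6, y = c(1.2, 1.9, 3.2, 3.8, 5.1, 6.0))
d$x2 <- 2 * d$x1                      # x2 is perfectly collinear with x1
fit <- lm(y ~ x1 + x2, data = d)

coef(fit)                              # the x2 coefficient is NA (aliased)
coefficient_or_zero_sketch(fit, "x2")          # 0
coefficient_or_zero_sketch(fit, "not_a_term")  # 0
```

Treating an aliased coefficient as 0 leaves the term out of the prediction, which matches how R drops aliased columns when predicting.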
Extract parameters from a linear model
extract_parameters(model)
model |
A list with the same signature as the output of lm() or glm() |
A character vector of terms from a linear model
Extract the factor name from an R model
get_factor_name(parameter, model)
parameter |
A parameter name. |
model |
A list with the same signature as the output of lm() or glm() |
A character string representing the factor name
Check if an R model contains a coefficient
has_parameter(model, parameter)
model |
A list with the same signature as the output of lm() or glm() |
parameter |
A parameter name |
A logical representing whether a coefficient is present in the model
Detect if the given model term is a factor
is_factor(parameter, model)
parameter |
A parameter name. |
model |
A list with the same signature as the output of lm() or glm() |
A logical representing whether or not the term is a factor
Detect if the given model term is an interaction
is_interaction(parameter)
parameter |
A parameter name. |
A logical representing whether or not the term is an interaction
Check if the given parameter is the intercept
is_intercept(parameter)
parameter |
A parameter name. |
A logical representing whether the given parameter is the intercept
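The interaction and intercept predicates above can be approximated from R's coefficient naming conventions alone: interaction coefficients contain a colon and the intercept is named `(Intercept)`. The helpers below are a sketch under that assumption, with hypothetical names; the package's own checks may be implemented differently.

```r
# Hypothetical helpers, not part of modelc.
is_interaction_sketch <- function(parameter) {
  # R joins the components of an interaction coefficient with ":"
  grepl(":", parameter, fixed = TRUE)
}

is_intercept_sketch <- function(parameter) {
  identical(parameter, "(Intercept)")
}

fit <- lm(mpg ~ wt * hp, data = mtcars)
parameters <- names(coef(fit))   # "(Intercept)" "wt" "hp" "wt:hp"

is_interaction_sketch(parameters)                  # vectorised over names
vapply(parameters, is_intercept_sketch, logical(1))
```

Classifying each coefficient name this way lets a compiler dispatch to the matching term builder.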
Compile an R model to a valid T-SQL formula
modelc(model, modify_scipen = TRUE)
model |
A list with the same signature as the output of lm() or glm() |
modify_scipen |
A logical flag indicating whether to modify the "scipen" option (R's penalty for printing numbers in scientific notation) to avoid generating invalid SQL |
A character string representing a SQL model formula
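As background for the `modify_scipen` argument, the sketch below shows how the "scipen" option changes the way R formats small numbers, and one way to raise it temporarily and restore it afterwards. The helper name `with_fixed_notation_sketch` is hypothetical, and this is only a plausible reading of what the flag does, not the package's code.

```r
# Hypothetical helper, not part of modelc: evaluate an expression with a high
# "scipen" so numbers are formatted in fixed rather than scientific notation.
with_fixed_notation_sketch <- function(expr) {
  old <- options(scipen = 999)
  on.exit(options(old), add = TRUE)  # restore the caller's setting on exit
  force(expr)                        # lazy argument is evaluated here
}

# With R's default scipen = 0 this is formatted as "1e-05"
format(0.00001)

# Inside the helper the same value is formatted as "0.00001"
with_fixed_notation_sketch(format(0.00001))
```

Restoring the previous value with `on.exit()` keeps the change from leaking into the rest of the session.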
a <- 1:10
b <- 2*1:10
c <- as.factor(a)
df <- data.frame(a, b, c)
formula = b ~ a + c

# A vanilla linear model
linear_model <- lm(formula, data = df)
modelc::modelc(linear_model)

# A generalized linear model with gamma family distribution and log link function
gamma_loglink_model <- glm(formula, data = df, family=Gamma(link="log"))
modelc::modelc(gamma_loglink_model)

# A generalized linear model with gamma family distribution and identity link function
gamma_idlink_model <- glm(formula, data = df, family=Gamma(link="identity"))
modelc::modelc(gamma_idlink_model)
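Since the generated SQL is meant to be included in a 'SELECT' statement, a small sketch of that last step may help. The helper name, the `prediction` alias, the table name `my_table`, and the placeholder expression are all assumptions made for illustration; in practice the expression would be the string returned by modelc().

```r
# Hypothetical helper, not part of modelc: embed a prediction expression in a
# SELECT statement over a given table.
select_with_prediction_sketch <- function(prediction_sql, table,
                                          alias = "prediction") {
  paste0("SELECT *, ", prediction_sql, " AS ", alias, " FROM ", table)
}

# Placeholder expression standing in for the output of modelc::modelc()
prediction_sql <- "EXP(0.5 + 0.1 * [a])"
query <- select_with_prediction_sketch(prediction_sql, "my_table")
cat(query, "\n")
```

The resulting query string can then be sent to the database through whichever client is in use.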