L-infinity histogram gap
- humancompatible.detect.methods.l_inf.l_inf.check_l_inf_gap(X: ndarray, y: ndarray, binarizer: Binarizer, feature_involved: str, subgroup_to_check: Any, delta: float, verbose: int = 1) → bool
Test whether a protected subgroup’s outcome distribution differs from that of the overall population by at most delta in the L-infinity norm.
- Parameters:
X (np.ndarray) – Protected-attribute slice of the dataset (same rows as y).
y (np.ndarray) – Boolean target vector.
binarizer (Binarizer) – The binarizer used to encode X and y.
feature_involved (str) – Name of the protected column whose subgroup is tested.
subgroup_to_check (Any) – Raw value of the subgroup to isolate.
delta (float) – Threshold for the L-infinity norm.
verbose (int, default 1) – Verbosity level. 0 = silent, 1 = logger output only, 2 = all detailed logs (including solver output).
- Returns:
True if the subgroup histogram is within delta of the overall histogram in the L-infinity norm; False otherwise.
- Return type:
bool
- Raises:
ValueError – If delta is not positive.
KeyError – If feature_involved is not in the binarizer’s feature names.
KeyError – If subgroup_to_check is not a valid value for the feature.
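The quantity being tested can be illustrated with a minimal NumPy sketch. This is not the library's implementation: it assumes the histogram is taken over the Boolean target y, it replaces the Binarizer lookup with a plain Boolean mask, and the helper name l_inf_gap is hypothetical.

```python
import numpy as np


def l_inf_gap(y, mask):
    """Hypothetical helper: L-infinity distance between the outcome
    histogram of the full sample and that of the rows selected by mask."""
    y = np.asarray(y, dtype=bool)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("subgroup is empty")
    # Two-bin histograms (False, True), normalised to relative frequencies.
    hist_all = np.bincount(y.astype(int), minlength=2) / y.size
    hist_sub = np.bincount(y[mask].astype(int), minlength=2) / mask.sum()
    # The L-infinity norm is the largest absolute per-bin difference.
    return float(np.max(np.abs(hist_all - hist_sub)))


# Toy data: subgroup "a" has 3 positives out of 4, the population 4 out of 8.
y = np.array([1, 1, 1, 0, 1, 0, 0, 0], dtype=bool)
group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

gap = l_inf_gap(y, group == "a")
within_tight = gap <= 0.1   # delta = 0.1
within_loose = gap <= 0.3   # delta = 0.3
```

In this toy data the subgroup histogram is (0.25, 0.75) and the overall histogram is (0.5, 0.5), so the decision flips depending on whether delta is below or above the per-bin difference.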
- humancompatible.detect.methods.l_inf.lp_tools.lin_prog_feas(hist1: ndarray, hist2: ndarray, delta: float, num_samples: float = 1.0) → int
Sample a fraction of the histogram bins at random and check whether every sampled bin satisfies
|hist1 - hist2| <= delta
- Parameters:
hist1 (np.ndarray) – 1-D array (or (n,1) column vector) of histogram bin densities for the full dataset.
hist2 (np.ndarray) – 1-D array (or (n,1) column vector) of histogram bin densities for the subgroup.
delta (float) – Threshold for the absolute difference |hist1 - hist2|.
num_samples (float) – Fraction of total bins to sample. The function draws int(num_samples * (len(hist1) - 1)) random samples.
- Returns:
Status code from scipy.optimize.linprog. A status of 0 indicates that the constraints are feasible (i.e., |hist1 - hist2| <= delta for all sampled bins); other codes signal infeasibility (status 2) or solver errors.
- Return type:
int
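One way to phrase the sampled check as a feasibility problem for scipy.optimize.linprog is sketched below. It is an illustration under stated assumptions rather than the library's code: the single auxiliary variable t, sampling without replacement, the optional seed argument, and the name lin_prog_feas_sketch are all choices made here.

```python
import numpy as np
from scipy.optimize import linprog


def lin_prog_feas_sketch(hist1, hist2, delta, num_samples=1.0, seed=None):
    """Hypothetical sketch: return the linprog status for the constraints
    |hist1[i] - hist2[i]| <= delta over a random sample of bins."""
    h1 = np.ravel(np.asarray(hist1, dtype=float))  # accepts (n,) or (n, 1)
    h2 = np.ravel(np.asarray(hist2, dtype=float))
    n = h1.size
    # Sample count as described in the parameter docs; the floor of one
    # sample is an assumption so the LP always has a constraint.
    k = max(1, int(num_samples * (n - 1)))
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=k, replace=False)  # assumption: no replacement
    d = h1[idx] - h2[idx]

    # One variable t with 0 <= t <= delta and t >= |d_i| for each sampled
    # bin, written as -t <= -d_i and -t <= d_i. Such a t exists exactly
    # when every sampled |d_i| is at most delta.
    a_ub = -np.ones((2 * k, 1))
    b_ub = np.concatenate([-d, d])
    res = linprog(c=[0.0], A_ub=a_ub, b_ub=b_ub,
                  bounds=[(0.0, delta)], method="highs")
    return int(res.status)


# Both bins differ by 0.4, so the outcome does not depend on the sample.
h_all = np.array([0.5, 0.5])
h_sub = np.array([0.9, 0.1])
status_ok = lin_prog_feas_sketch(h_all, h_sub, delta=0.5)
status_bad = lin_prog_feas_sketch(h_all, h_sub, delta=0.1)
```

Because the objective is zero, the solver only has to decide feasibility, which is why the status code alone carries the answer.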