#include "caffe2/operators/bisect_percentile_op.h"

REGISTER_CPU_OPERATOR(BisectPercentile, BisectPercentileOp<CPUContext>);
OPERATOR_SCHEMA(BisectPercentile)
This operator maps raw feature values into percentile
representations based on bisection, for one or more features.

The input is the batch of input feature values, with the size of (batch_size,
num_feature), where num_feature = F (F >= 1).

For each feature, we also need additional information regarding the feature.
There are several vectors that hold the data-to-percentile mapping information
as arguments (context):
1. feature raw values (R)
2. feature percentile mapping (P)
3. feature percentile lower bound (L)
4. feature percentile upper bound (U)

Suppose the sampled data distribution is as follows:
1, 1, 2, 2, 2, 2, 2, 2, 3, 4
We have the mapping vectors as follows:
P = [0.15, 0.55, 0.9, 1.0]
L = [0.1, 0.3, 0.9, 1.0]
U = [0.2, 0.8, 0.9, 1.0]
Where P is computed as (L + U) / 2.

For a given list of feature values, X = [x_0, x_1, ..., x_i, ...], for each
feature value (x_i) we first apply bisection to find the right index (t),
such that R[t] <= x_i < R[t+1].
If x_i = R[t], P[t] is returned;
otherwise, interpolation is applied using (R[t], R[t+1]) and (U[t] and L[t]).

As there are F features (F >= 1), we concatenate all the R_f, P_f, L_f, and
U_f for each feature f and use an additional input length to keep track of
the number of points for each set of raw feature value to percentile mapping.
For example, there are two features:
We will build R = [0.1, 0.4, 0.5, 0.3, 1.2]; besides, we have
to indicate the boundaries of the percentile information.

"1D tensor, which is the concatenation of all sorted raw feature "
"values for all features.")
"1D tensor. There is a one-to-one mapping between percentile_mapping and "
"percentile_raw such that each element in percentile_mapping "
"corresponds to the percentile value of the corresponding raw feature "

"1D tensor. There is a one-to-one mapping between this tensor and "
"percentile_raw such that each element in this tensor "
"corresponds to the percentile lower bound of the corresponding raw "

"1D tensor. There is a one-to-one mapping between percentile_upper and "
"percentile_raw such that each element in percentile_upper "
"corresponds to the percentile upper bound of the corresponding raw "

"1D tensor. There is a one-to-one mapping between percentile_upper and "
"percentile_raw such that each element in percentile_upper "
"corresponds to the percentile upper bound of the corresponding raw "

"Input 2D tensor of floats of size (N, D), where N is the batch size "
"and D is the feature dimension.")
"2D tensor of output with the same dimensions as the input raw_values.");
NO_GRADIENT(BisectPercentile);
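
The lookup described in the operator documentation can be illustrated with a small standalone sketch. It is not the Caffe2 implementation: `PercentileMap`, `bisect_percentile`, and `bisect_percentile_batch` are hypothetical names introduced here, and the sketch rests on three assumptions. First, values below the smallest raw value map to 0 and values above the largest map to 1. Second, the interpolation endpoints are taken to be U[t] at R[t] and L[t+1] at R[t+1], which keeps the result monotone; the documentation itself names U[t] and L[t]. Third, the per-feature lengths are passed as a plain vector, and the batch is stored row-major.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the concatenated mapping vectors (not a Caffe2 type).
struct PercentileMap {
  std::vector<float> R; // sorted raw values, one run per feature
  std::vector<float> P; // percentile mapping
  std::vector<float> L; // percentile lower bound
  std::vector<float> U; // percentile upper bound
};

// Map one raw value x to a percentile using the slice [begin, begin + n)
// of the concatenated vectors.
float bisect_percentile(
    const PercentileMap& m, std::size_t begin, std::size_t n, float x) {
  assert(n > 0 && begin + n <= m.R.size());
  assert(m.P.size() == m.R.size() && m.L.size() == m.R.size() &&
         m.U.size() == m.R.size());
  const float* r = m.R.data() + begin;

  // Assumption: values outside the sampled range clamp to 0 or 1.
  if (x < r[0]) return 0.0f;
  if (x > r[n - 1]) return 1.0f;

  // Bisection: upper_bound finds the first element greater than x, so
  // stepping back one position gives t with R[t] <= x.
  const std::size_t t =
      static_cast<std::size_t>(std::upper_bound(r, r + n, x) - r) - 1;

  // Exact match returns P[t]. The second condition only guards the last
  // slot (for example against NaN input), where no R[t+1] exists.
  if (r[t] == x || t + 1 == n) return m.P[begin + t];

  // Assumption: linear interpolation from U[t] at R[t] to L[t+1] at R[t+1].
  const float w = (x - r[t]) / (r[t + 1] - r[t]);
  return (1.0f - w) * m.U[begin + t] + w * m.L[begin + t + 1];
}

// Apply the lookup to a row-major (batch_size x F) batch. lengths[f] is the
// number of mapping points for feature f in the concatenated vectors.
std::vector<float> bisect_percentile_batch(
    const PercentileMap& m,
    const std::vector<std::size_t>& lengths,
    const std::vector<float>& X) {
  const std::size_t F = lengths.size();
  assert(F > 0 && X.size() % F == 0);

  // offsets[f] is where feature f's slice starts.
  std::vector<std::size_t> offsets(F, 0);
  for (std::size_t f = 1; f < F; ++f) {
    offsets[f] = offsets[f - 1] + lengths[f - 1];
  }

  std::vector<float> out(X.size());
  for (std::size_t i = 0; i < X.size(); ++i) {
    const std::size_t f = i % F; // column index of element i
    out[i] = bisect_percentile(m, offsets[f], lengths[f], X[i]);
  }
  return out;
}
```

With the toy vectors from the documentation, and taking R to be the distinct sample values 1, 2, 3, 4 (an inference from the sampled distribution), x = 2 matches R[1] exactly and returns P[1] = 0.55. For x = 1.5 the weight is 0.5, so under the assumed endpoints the result is 0.5 * 0.2 + 0.5 * 0.3 = 0.25.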