Tested model:
=============

------- notif_encoder:
NotifEncoder(
  (model): Sequential(
    (0): Linear(in_features=29, out_features=64, bias=True)
    (1): ReLU()
    (2): Linear(in_features=64, out_features=32, bias=True)
    (3): Dropout(p=0.2, inplace=False)
  )
)

------- evnt_encoder:
EvntEncoder(
  (model): Sequential(
    (0): Linear(in_features=11, out_features=64, bias=True)
    (1): ReLU()
    (2): Linear(in_features=64, out_features=32, bias=True)
    (3): Dropout(p=0.2, inplace=False)
  )
)

------- embedder:
AttentionEmbedder(
  (adapter): Linear(in_features=32, out_features=32, bias=True)
  (representer): AttentionRepresenter(
    (models): ParameterList(
        (0): Object of type: AttentionRepresenterBlock
        (1): Object of type: AttentionRepresenterBlock
      (0): AttentionRepresenterBlock(
        (first_part_norm): BatchNorm1d(22, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (first_part_Q): Linear(in_features=32, out_features=160, bias=True)
        (first_part_K): Linear(in_features=32, out_features=160, bias=True)
        (first_part_V): Linear(in_features=32, out_features=160, bias=True)
        (first_part_att): MultiheadAttention(
          (out_proj): NonDynamicallyQuantizableLinear(in_features=160, out_features=160, bias=True)
        )
        (first_part_final): Sequential(
          (0): Linear(in_features=160, out_features=32, bias=True)
          (1): Dropout(p=0.2, inplace=False)
        )
        (second_part_norm): BatchNorm1d(22, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (second_part_mlp): Sequential(
          (0): Linear(in_features=32, out_features=64, bias=True)
          (1): GELU(approximate='none')
          (2): Dropout(p=0.2, inplace=False)
          (3): Linear(in_features=64, out_features=32, bias=True)
          (4): GELU(approximate='none')
          (5): Dropout(p=0.2, inplace=False)
        )
      )
      (1): AttentionRepresenterBlock(
        (first_part_norm): BatchNorm1d(22, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (first_part_Q): Linear(in_features=32, out_features=160, bias=True)
        (first_part_K): Linear(in_features=32, out_features=160, bias=True)
        (first_part_V): Linear(in_features=32, out_features=160, bias=True)
        (first_part_att): MultiheadAttention(
          (out_proj): NonDynamicallyQuantizableLinear(in_features=160, out_features=160, bias=True)
        )
        (first_part_final): Sequential(
          (0): Linear(in_features=160, out_features=32, bias=True)
          (1): Dropout(p=0.2, inplace=False)
        )
        (second_part_norm): BatchNorm1d(22, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (second_part_mlp): Sequential(
          (0): Linear(in_features=32, out_features=64, bias=True)
          (1): GELU(approximate='none')
          (2): Dropout(p=0.2, inplace=False)
          (3): Linear(in_features=64, out_features=32, bias=True)
          (4): GELU(approximate='none')
          (5): Dropout(p=0.2, inplace=False)
        )
      )
    )
  )
  (mlp): Sequential(
    (0): Linear(in_features=64, out_features=64, bias=True)
    (1): ReLU()
    (2): Dropout(p=0.2, inplace=False)
    (3): Linear(in_features=64, out_features=32, bias=True)
  )
)

(Training dataset class occurrences: {3: 176, 1: 46, 4: 45, 2: 1}.)

svm finished: True;  SV cnt 74; Trained SVM with LINEAR kernel with loss 720449.2712342693 and 74 outliers.
svm finished: False; SV cnt 40; Trained SVM with POLYNOMIAL (deg 2) kernel with loss 400000.0 and 40 outliers.
svm finished: True;  SV cnt 74; Trained SVM with GAUSSIAN (sigma 1) kernel with loss 400000.0 and 40 outliers.
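For readers mapping the printout back to code, here is a minimal PyTorch sketch of one AttentionRepresenterBlock. Only the module tree is printed above, so the forward pass (pre-norm residual wiring), the head count (assumed 5 = 160 dims / 32 per head), and the batch-first layout are assumptions, not the verified implementation.

import torch
import torch.nn as nn

class AttentionRepresenterBlock(nn.Module):
    # Reconstruction of one block from the printout above. Only the module
    # tree is visible, so forward() below is an assumed wiring.
    def __init__(self, dim=32, att_dim=160, seq_len=22, num_heads=5, p=0.2):
        super().__init__()
        # BatchNorm1d(22) implies the sequence axis (length 22) is treated
        # as the channel dimension, i.e. inputs of shape (batch, 22, 32).
        self.first_part_norm = nn.BatchNorm1d(seq_len)
        self.first_part_Q = nn.Linear(dim, att_dim)
        self.first_part_K = nn.Linear(dim, att_dim)
        self.first_part_V = nn.Linear(dim, att_dim)
        self.first_part_att = nn.MultiheadAttention(att_dim, num_heads, batch_first=True)
        self.first_part_final = nn.Sequential(nn.Linear(att_dim, dim), nn.Dropout(p))
        self.second_part_norm = nn.BatchNorm1d(seq_len)
        self.second_part_mlp = nn.Sequential(
            nn.Linear(dim, 2 * dim), nn.GELU(), nn.Dropout(p),
            nn.Linear(2 * dim, dim), nn.GELU(), nn.Dropout(p),
        )

    def forward(self, x):                       # x: (batch, 22, 32)
        h = self.first_part_norm(x)
        att, _ = self.first_part_att(self.first_part_Q(h),
                                     self.first_part_K(h),
                                     self.first_part_V(h))
        x = x + self.first_part_final(att)      # assumed residual connection
        x = x + self.second_part_mlp(self.second_part_norm(x))
        return x

x = torch.randn(4, 22, 32)
print(AttentionRepresenterBlock()(x).shape)     # torch.Size([4, 22, 32])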
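The three "svm finished" log lines above come from a custom trainer; as a rough stand-in, the same three kernel configurations can be reproduced with sklearn's OneClassSVM, whose predict() also returns -1 for outliers. The random embeddings and nu value here are placeholders, not the project's data or settings.

import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
emb_train = rng.normal(size=(268, 32))          # 268 = 176 + 46 + 45 + 1 training notifications
emb_annot = rng.normal(size=(9, 32))            # the 9 annotated notifications

kernels = [("linear", {}),
           ("poly", {"degree": 2}),             # POLYNOMIAL (deg 2)
           ("rbf", {"gamma": 0.5})]             # GAUSSIAN; gamma = 1/(2*sigma^2) for sigma = 1
for kernel, params in kernels:
    svm = OneClassSVM(kernel=kernel, nu=0.15, **params).fit(emb_train)
    # predict() returns +1 for inliers and -1 for outliers, matching the
    # -1 rows in the classification results below.
    print(kernel, "SV cnt:", len(svm.support_), "pred:", svm.predict(emb_annot))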
Results of classification of specially annotated notifications:

IDs of classified notifications:     [11168292, 11169523, 11173814, 9961382, 9962202, 9971171, 10168557, 10213020, 10275769]
classes of classified notifications: [3, 3, 1, 3, 3, 1, 3, 2, 3]

SVM with LINEAR kernel:     tensor([-1., -1., -1., -1., -1., -1., -1., -1., -1.], dtype=torch.float64)
SVM with POLYNOMIAL kernel: tensor([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.], dtype=torch.float64)
SVM with GAUSSIAN kernel:   tensor([-1., -1., -1., -1., -1., -1., -1., -1., -1.], dtype=torch.float64)
logistic regression:
tensor([[1.0000e+00, 8.7569e-14],
        [1.0000e+00, 8.7569e-14],
        [1.0000e+00, 8.7569e-14],
        [1.0000e+00, 8.7569e-14],
        [1.0000e+00, 8.7569e-14],
        [1.0000e+00, 8.7569e-14],
        [1.0000e+00, 8.7572e-14],
        [1.0000e+00, 8.7572e-14],
        [1.0000e+00, 8.7572e-14]], grad_fn=<...>)
MLP classifier:
tensor([[1.0000e+00, 2.7591e-16],
        [1.0000e+00, 2.7591e-16],
        [1.0000e+00, 2.7591e-16],
        [1.0000e+00, 2.7591e-16],
        [1.0000e+00, 2.7591e-16],
        [1.0000e+00, 2.7591e-16],
        [1.0000e+00, 2.7592e-16],
        [1.0000e+00, 2.7592e-16],
        [1.0000e+00, 2.7592e-16]], grad_fn=<...>)

Results on test set:
SVM with LINEAR kernel:     0.0 (0/13)
SVM with POLYNOMIAL kernel: 0.0 (0/13)
SVM with GAUSSIAN kernel:   0.0 (0/13)
logistic regression:        1.0 (13/13)
MLP classifier:             1.0 (13/13)
(class occurrence counts: {3: 11, 4: 2})
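For clarity, a minimal sketch of how the test-set accuracy figures above can be derived from the two-class probability outputs (logistic regression / MLP). The mapping from the annotation classes to binary labels is an assumption, since the log does not show it.

import torch

def accuracy_from_probs(probs: torch.Tensor, labels: torch.Tensor) -> float:
    # Predict the argmax class over the two probability columns and
    # count matches against the binary labels.
    preds = probs.argmax(dim=1)
    return (preds == labels).float().mean().item()

# The 13/13 rows above are reproduced if all 13 test notifications map to
# the first binary class -- an assumed label encoding for illustration.
probs = torch.tensor([[1.0000, 8.7569e-14]] * 13)
labels = torch.zeros(13, dtype=torch.long)
print(accuracy_from_probs(probs, labels))       # 1.0 -> "1.0 (13/13)"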