KerasGRU4Rec#

  • Out-of-the-box:

    The implementation as is.

  • Minor fix:
    1. The previously hard-coded parameters hidden_size, dropout_p_hidden, and learning_rate can now be set.

    2. The default optimizer is changed to Adagrad.

  • Major fix:
    1. Fixed incorrect resetting of hidden states (the same error that GRU4REC-pytorch has).

    2. Epochs no longer end when the remaining sessions are too few to fill the mini-batch.
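The two major fixes can be illustrated with a minimal sketch of session-parallel mini-batching, the batching scheme GRU4Rec uses. This is not KerasGRU4Rec's code: the function name `session_parallel_steps`, the toy sessions, and the scalar stand-in for the hidden state are all hypothetical. The sketch assumes every session has at least two events. It shows a per-slot reset flag (only the slot that starts a new session has its state zeroed) and an epoch that keeps running with a shrinking batch once there are no more sessions to fill empty slots.

```python
def session_parallel_steps(sessions, batch_size):
    """Yield training steps for session-parallel mini-batches.

    sessions: list of item-ID lists, each holding at least two events.
    Each step is a list of (slot, input_item, target_item, reset) tuples;
    reset is True when the slot has just been given a new session, so its
    hidden state must be zeroed before the step.
    """
    next_session = 0
    slots = [None] * batch_size      # slot -> [session index, position]
    reset = [False] * batch_size
    for i in range(batch_size):
        if next_session < len(sessions):
            slots[i] = [next_session, 0]
            reset[i] = True
            next_session += 1

    # Keep going while any slot is active, even if the batch is no longer full.
    while any(slot is not None for slot in slots):
        step = []
        for i, slot in enumerate(slots):
            if slot is not None:
                s, p = slot
                step.append((i, sessions[s][p], sessions[s][p + 1], reset[i]))
        yield step

        for i, slot in enumerate(slots):
            if slot is None:
                continue
            s, p = slot
            reset[i] = False
            if p + 2 < len(sessions[s]):
                slot[1] = p + 1                  # same session, next event
            elif next_session < len(sessions):
                slots[i] = [next_session, 0]     # slot takes a new session
                reset[i] = True
                next_session += 1
            else:
                slots[i] = None                  # nothing left; batch shrinks


toy_sessions = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10, 11]]
hidden = [0.0, 0.0]  # stand-in for one hidden-state row per slot
for step in session_parallel_steps(toy_sessions, batch_size=2):
    for slot, item, target, is_reset in step:
        if is_reset:
            hidden[slot] = 0.0   # reset only the slot that changed session
        hidden[slot] += 1.0      # placeholder for the GRU update
    print(step)
```

With the toy data, the last step contains a single slot, so the final events of the longest session are still trained on instead of being dropped when the batch can no longer be filled.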

Rees46#

Note: BPR-Max is not supported by KerasGRU4Rec.

Metrics#

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1108 | 0.1108 | 0.3000 | 0.1772 | 0.4126 | 0.1922 | 0.5291 | 0.2003 |
| GRU4Rec Official | KerasGRU4Rec params | 0.0918 | 0.0918 | 0.2639 | 0.1518 | 0.3713 | 0.1660 | 0.4860 | 0.1740 |
| GRU4Rec Official | KerasGRU4Rec params | 0.1097 | 0.1097 | 0.2946 | 0.1746 | 0.4029 | 0.1890 | 0.5152 | 0.1969 |
| KerasGRU4Rec | OOB | 0.0805 | 0.0805 | 0.2394 | 0.1354 | 0.3444 | 0.1494 | 0.4608 | 0.1574 |
| KerasGRU4Rec | Correct exp | 0.1027 | 0.1027 | 0.2799 | 0.1648 | 0.3864 | 0.1790 | 0.4979 | 0.1868 |
| KerasGRU4Rec | Correct full | 0.1027 | 0.1027 | 0.2796 | 0.1647 | 0.3859 | 0.1788 | 0.4978 | 0.1866 |
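For readers unfamiliar with the two metric families, the sketch below shows the usual definitions of Recall@K and MRR@K for next-item prediction, computed from the rank of the true next item. It is an illustration under that standard definition, not the evaluation code used for these experiments; the function name `recall_and_mrr_at_k` and the example ranks are hypothetical. It also shows why the Recall@1 and MRR@1 columns are always identical: a hit at rank 1 contributes 1/1 to MRR.

```python
def recall_and_mrr_at_k(ranks, k):
    """Compute (Recall@k, MRR@k) from 1-based ranks of the true next items.

    ranks: one rank per test event; rank 1 means the target item was
    scored highest among all items.
    """
    hits = [r for r in ranks if r <= k]
    recall = len(hits) / len(ranks)                 # share of targets in the top k
    mrr = sum(1.0 / r for r in hits) / len(ranks)   # misses contribute 0
    return recall, mrr


example_ranks = [1, 3, 12, 2, 1, 7]  # made-up ranks, for illustration only
for k in (1, 5, 10, 20):
    recall, mrr = recall_and_mrr_at_k(example_ranks, k)
    print(f"Recall@{k}={recall:.4f}  MRR@{k}={mrr:.4f}")
```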

Metric difference compared to the “Best params” version with the corresponding loss#

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | KerasGRU4Rec params | -17.15% | -17.15% | -12.03% | -14.35% | -10.03% | -13.61% | -8.15% | -13.13% |
| GRU4Rec Official | KerasGRU4Rec params | -1.07% | -1.07% | -1.79% | -1.45% | -2.35% | -1.63% | -2.64% | -1.72% |
| KerasGRU4Rec | OOB | -27.35% | -27.35% | -20.21% | -23.56% | -16.54% | -22.28% | -12.91% | -21.39% |
| KerasGRU4Rec | Correct exp | -7.32% | -7.32% | -6.70% | -6.96% | -6.35% | -6.85% | -5.89% | -6.75% |
| KerasGRU4Rec | Correct full | -7.34% | -7.34% | -6.80% | -7.06% | -6.48% | -6.96% | -5.92% | -6.84% |
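The differences appear to be relative changes with respect to the "Best params" row, expressed in percent. The sketch below recomputes two Rees46 Recall@1 cells from the rounded metric values; the helper name `relative_diff_percent` is hypothetical. Because the published metrics are rounded to four decimals, recomputing from them will not reproduce every cell exactly (for instance, 0.1097 against 0.1108 gives about -0.99% rather than the listed -1.07%), which suggests the listed differences were derived from unrounded values.

```python
def relative_diff_percent(value, baseline):
    """Relative change of value with respect to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


best_recall_at_1 = 0.1108  # GRU4Rec Official, Best params (Rees46)

# KerasGRU4Rec OOB, Recall@1 = 0.0805
print(f"{relative_diff_percent(0.0805, best_recall_at_1):.2f}%")

# GRU4Rec Official with KerasGRU4Rec params (first row), Recall@1 = 0.0918
print(f"{relative_diff_percent(0.0918, best_recall_at_1):.2f}%")
```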

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | GRU4Rec Official | KerasGRU4Rec | KerasGRU4Rec | KerasGRU4Rec |
| --- | --- | --- | --- | --- | --- | --- |
| Variant | Best params | KerasGRU4Rec params | KerasGRU4Rec params | OOB | Correct exp | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adam | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False | False |
| embedding | 0 | 0 | 0 | 0 | 0 | 0 |
| final_act | softmax | softmax | softmax | softmax | softmax | softmax |
| layers | 512 | 512 | 512 | 100 | 512 | 512 |
| batch_size | 240 | 240 | 240 | 240 | 240 | 240 |
| dropout_p_embed | 0.45 | 0 | 0 | N/A | N/A | N/A |
| dropout_p_hidden | 0 | 0 | 0 | 0.25 | 0 | 0 |
| learning_rate | 0.065 | 0.065 | 0.065 | 0.001 | 0.065 | 0.065 |
| momentum | 0 | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 2048 | ALL | N/A | N/A | N/A |
| sample_alpha | 0.5 | 0 | N/A | N/A | N/A | N/A |
| bpreg | 0 | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A | N/A |

Runtime metrics#

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 367.41 | | | 650.91 | 156189.00 |
| GRU4Rec Official | KerasGRU4Rec params | 306.43 | 0.83 x | | 780.44 | 187270.00 |
| GRU4Rec Official | KerasGRU4Rec params | 35765.38 | 97.34 x | 116.72 x | 6.69 | 1604.00 |
| KerasGRU4Rec | OOB | 123265.62 | 335.50 x | 402.26 x | 1.94 | 465.52 |
| KerasGRU4Rec | Correct exp | 130917.81 | 356.33 x | 427.24 x | 1.83 | 438.31 |
| KerasGRU4Rec | Correct full | 130987.98 | 356.52 x | 427.46 x | 1.83 | 438.08 |
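The "x" columns are not defined on this page. Judging from the numbers, "to Best" is the average epoch time divided by that of the "Best params" row, and "to Matching" is the average epoch time divided by that of the first "KerasGRU4Rec params" row. The sketch below checks this reading against the Rees46 values; the helper name `slowdown` is hypothetical and the interpretation is an inference from the data, not a documented definition.

```python
def slowdown(epoch_time, reference_epoch_time):
    """How many times longer an epoch takes than the reference run."""
    return epoch_time / reference_epoch_time


best_params_time = 367.41    # GRU4Rec Official, Best params
matching_time = 306.43       # GRU4Rec Official, KerasGRU4Rec params (first row)
keras_oob_time = 123265.62   # KerasGRU4Rec, OOB

print(f"{slowdown(keras_oob_time, best_params_time):.2f} x")  # "to Best" column
print(f"{slowdown(keras_oob_time, matching_time):.2f} x")     # "to Matching" column
```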

Yoochoose#

Note: BPR-Max is not supported by KerasGRU4Rec.

Metrics#

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1797 | 0.1797 | 0.4457 | 0.2757 | 0.5698 | 0.2924 | 0.6804 | 0.3002 |
| GRU4Rec Official | KerasGRU4Rec params | 0.1734 | 0.1734 | 0.4351 | 0.2671 | 0.5612 | 0.2841 | 0.6722 | 0.2919 |
| GRU4Rec Official | KerasGRU4Rec params | 0.1851 | 0.1851 | 0.4490 | 0.2802 | 0.5724 | 0.2968 | 0.6809 | 0.3044 |
| KerasGRU4Rec | OOB | 0.1576 | 0.1576 | 0.3915 | 0.2410 | 0.5195 | 0.2581 | 0.6392 | 0.2666 |
| KerasGRU4Rec | Correct exp | 0.1815 | 0.1815 | 0.4444 | 0.2766 | 0.5694 | 0.2934 | 0.6784 | 0.3010 |
| KerasGRU4Rec | Correct full | 0.1824 | 0.1824 | 0.4446 | 0.2771 | 0.5678 | 0.2936 | 0.6768 | 0.3013 |

Metric difference compared to the “Best params” version with the corresponding loss#

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | KerasGRU4Rec params | -3.47% | -3.47% | -2.40% | -3.12% | -1.50% | -2.84% | -1.21% | -2.76% |
| GRU4Rec Official | KerasGRU4Rec params | 3.00% | 3.00% | 0.72% | 1.61% | 0.45% | 1.49% | 0.08% | 1.39% |
| KerasGRU4Rec | OOB | -12.28% | -12.28% | -12.17% | -12.58% | -8.84% | -11.73% | -6.04% | -11.20% |
| KerasGRU4Rec | Correct exp | 1.03% | 1.03% | -0.29% | 0.32% | -0.07% | 0.35% | -0.28% | 0.29% |
| KerasGRU4Rec | Correct full | 1.52% | 1.52% | -0.27% | 0.48% | -0.35% | 0.42% | -0.52% | 0.37% |

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | GRU4Rec Official | KerasGRU4Rec | KerasGRU4Rec | KerasGRU4Rec |
| --- | --- | --- | --- | --- | --- | --- |
| Variant | Best params | KerasGRU4Rec params | KerasGRU4Rec params | OOB | Correct exp | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adam | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False | False |
| embedding | 0 | 0 | 0 | 0 | 0 | 0 |
| final_act | softmax | softmax | softmax | softmax | softmax | softmax |
| layers | 480 | 480 | 480 | 100 | 480 | 480 |
| batch_size | 48 | 48 | 48 | 48 | 48 | 48 |
| dropout_p_embed | 0 | 0 | 0 | N/A | N/A | N/A |
| dropout_p_hidden | 0.2 | 0.2 | 0.2 | 0.25 | 0.2 | 0.2 |
| learning_rate | 0.07 | 0.07 | 0.07 | 0.001 | 0.07 | 0.07 |
| momentum | 0 | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 2048 | ALL | N/A | N/A | N/A |
| sample_alpha | 0.2 | 0 | N/A | N/A | N/A | N/A |
| bpreg | 0 | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A | N/A |

Runtime metrics#

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 451.75 | | | 991.99 | 47613.00 |
| GRU4Rec Official | KerasGRU4Rec params | 390.87 | 0.87 x | | 1146.51 | 55030.00 |
| GRU4Rec Official | KerasGRU4Rec params | 2800.25 | 6.20 x | 7.16 x | 160.03 | 7681.00 |
| KerasGRU4Rec | OOB | 13740.04 | 30.42 x | 35.15 x | 32.61 | 1565.46 |
| KerasGRU4Rec | Correct exp | 16610.50 | 36.77 x | 42.50 x | 26.98 | 1294.93 |
| KerasGRU4Rec | Correct full | 16666.31 | 36.89 x | 42.64 x | 26.89 | 1290.60 |

Coveo#

Note: BPR-Max is not supported by KerasGRU4Rec.

Metrics#

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.0489 | 0.0489 | 0.1418 | 0.0814 | 0.2085 | 0.0901 | 0.2947 | 0.0960 |
| GRU4Rec Official | KerasGRU4Rec params | 0.0466 | 0.0466 | 0.1360 | 0.0775 | 0.1993 | 0.0859 | 0.2852 | 0.0917 |
| GRU4Rec Official | KerasGRU4Rec params | 0.0480 | 0.0480 | 0.1383 | 0.0789 | 0.2029 | 0.0875 | 0.2880 | 0.0933 |
| KerasGRU4Rec | OOB | 0.0451 | 0.0451 | 0.1270 | 0.0732 | 0.1888 | 0.0813 | 0.2739 | 0.0871 |
| KerasGRU4Rec | Correct exp | 0.0477 | 0.0477 | 0.1370 | 0.0786 | 0.1997 | 0.0868 | 0.2822 | 0.0925 |
| KerasGRU4Rec | Correct full | 0.0469 | 0.0469 | 0.1361 | 0.0778 | 0.1993 | 0.0861 | 0.2826 | 0.0918 |

Metric difference compared to the “Best params” version with the corresponding loss#

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | KerasGRU4Rec params | -4.62% | -4.62% | -4.11% | -4.76% | -4.40% | -4.71% | -3.22% | -4.46% |
| GRU4Rec Official | KerasGRU4Rec params | -1.87% | -1.87% | -2.47% | -2.96% | -2.68% | -2.95% | -2.27% | -2.82% |
| KerasGRU4Rec | OOB | -7.72% | -7.72% | -10.47% | -10.03% | -9.44% | -9.81% | -7.06% | -9.24% |
| KerasGRU4Rec | Correct exp | -2.40% | -2.40% | -3.42% | -3.42% | -4.22% | -3.65% | -4.25% | -3.65% |
| KerasGRU4Rec | Correct full | -4.05% | -4.05% | -3.99% | -4.41% | -4.42% | -4.47% | -4.10% | -4.36% |

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | GRU4Rec Official | KerasGRU4Rec | KerasGRU4Rec | KerasGRU4Rec |
| --- | --- | --- | --- | --- | --- | --- |
| Variant | Best params | KerasGRU4Rec params | KerasGRU4Rec params | OOB | Correct exp | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adam | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False | False |
| embedding | 0 | 0 | 0 | 0 | 0 | 0 |
| final_act | softmax | softmax | softmax | softmax | softmax | softmax |
| layers | 512 | 512 | 512 | 100 | 512 | 512 |
| batch_size | 32 | 32 | 32 | 32 | 32 | 32 |
| dropout_p_embed | 0.4 | 0 | 0 | N/A | N/A | N/A |
| dropout_p_hidden | 0.15 | 0.15 | 0.15 | 0.15 | 0.15 | 0.15 |
| learning_rate | 0.03 | 0.03 | 0.03 | 0.001 | 0.03 | 0.03 |
| momentum | 0 | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 2048 | ALL | N/A | N/A | N/A |
| sample_alpha | 0 | 0 | N/A | N/A | N/A | N/A |
| bpreg | 0 | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A | N/A |

Runtime metrics#

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 37.17 | | | 1047.47 | 33505.00 |
| GRU4Rec Official | KerasGRU4Rec params | 31.86 | 0.86 x | | 1222.20 | 39093.00 |
| GRU4Rec Official | KerasGRU4Rec params | 68.66 | 1.85 x | 2.16 x | 567.11 | 18139.00 |
| KerasGRU4Rec | OOB | 568.22 | 15.29 x | 17.83 x | 68.49 | 2191.59 |
| KerasGRU4Rec | Correct exp | 677.80 | 18.24 x | 21.27 x | 57.41 | 1837.26 |
| KerasGRU4Rec | Correct full | 644.59 | 17.34 x | 20.23 x | 60.38 | 1932.25 |

Retailrocket#

Note: BPR-Max is not supported by KerasGRU4Rec.

Metrics#

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1150 | 0.1150 | 0.3026 | 0.1814 | 0.4019 | 0.1947 | 0.4953 | 0.2012 |
| GRU4Rec Official | KerasGRU4Rec params | 0.0981 | 0.0981 | 0.2531 | 0.1535 | 0.3338 | 0.1643 | 0.4239 | 0.1705 |
| GRU4Rec Official | KerasGRU4Rec params | 0.1057 | 0.1057 | 0.2601 | 0.1608 | 0.3405 | 0.1715 | 0.4200 | 0.1771 |
| KerasGRU4Rec | OOB | 0.0713 | 0.0713 | 0.1899 | 0.1124 | 0.2520 | 0.1207 | 0.3151 | 0.1250 |
| KerasGRU4Rec | Correct exp | 0.0962 | 0.0962 | 0.2440 | 0.1488 | 0.3209 | 0.1591 | 0.3938 | 0.1642 |
| KerasGRU4Rec | Correct full | 0.0962 | 0.0962 | 0.2463 | 0.1496 | 0.3206 | 0.1595 | 0.3957 | 0.1647 |

Metric difference compared to the “Best params” version with the corresponding loss#

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | KerasGRU4Rec params | -14.73% | -14.73% | -16.36% | -15.38% | -16.94% | -15.59% | -14.42% | -15.25% |
| GRU4Rec Official | KerasGRU4Rec params | -8.13% | -8.13% | -14.05% | -11.37% | -15.27% | -11.90% | -15.20% | -12.01% |
| KerasGRU4Rec | OOB | -37.98% | -37.98% | -37.26% | -38.05% | -37.30% | -38.03% | -36.38% | -37.88% |
| KerasGRU4Rec | Correct exp | -16.38% | -16.38% | -19.37% | -17.99% | -20.15% | -18.27% | -20.48% | -18.42% |
| KerasGRU4Rec | Correct full | -16.34% | -16.34% | -18.61% | -17.54% | -20.22% | -18.06% | -20.11% | -18.15% |

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | GRU4Rec Official | KerasGRU4Rec | KerasGRU4Rec | KerasGRU4Rec |
| --- | --- | --- | --- | --- | --- | --- |
| Variant | Best params | KerasGRU4Rec params | KerasGRU4Rec params | OOB | Correct exp | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adam | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False | False |
| embedding | 0 | 0 | 0 | 0 | 0 | 0 |
| final_act | softmax | softmax | softmax | softmax | softmax | softmax |
| layers | 192 | 192 | 192 | 100 | 192 | 192 |
| batch_size | 240 | 240 | 240 | 240 | 240 | 240 |
| dropout_p_embed | 0.5 | 0 | 0 | N/A | N/A | N/A |
| dropout_p_hidden | 0.05 | 0.05 | 0.05 | 0.25 | 0.05 | 0.05 |
| learning_rate | 0.085 | 0.085 | 0.085 | 0.001 | 0.085 | 0.085 |
| momentum | 0.3 | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 2048 | ALL | N/A | N/A | N/A |
| sample_alpha | 0.3 | 0 | N/A | N/A | N/A | N/A |
| bpreg | 0 | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A | N/A |

Runtime metrics#

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 2.77 | | | 880.71 | 199935.00 |
| GRU4Rec Official | KerasGRU4Rec params | 2.20 | 0.79 x | | 1111.09 | 252235.00 |
| GRU4Rec Official | KerasGRU4Rec params | 71.93 | 25.97 x | 32.70 x | 33.96 | 7710.00 |
| KerasGRU4Rec | OOB | 276.29 | 99.74 x | 125.59 x | 8.35 | 2004.12 |
| KerasGRU4Rec | Correct exp | 282.69 | 102.05 x | 128.50 x | 8.16 | 1958.63 |
| KerasGRU4Rec | Correct full | 281.92 | 101.78 x | 128.15 x | 8.18 | 1964.05 |

Diginetica#

Note: BPR-Max is not supported by KerasGRU4Rec.

Metrics#

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.0647 | 0.0647 | 0.2220 | 0.1181 | 0.3414 | 0.1339 | 0.4874 | 0.1440 |
| GRU4Rec Official | KerasGRU4Rec params | 0.0572 | 0.0572 | 0.1962 | 0.1042 | 0.3003 | 0.1178 | 0.4304 | 0.1268 |
| GRU4Rec Official | KerasGRU4Rec params | 0.0604 | 0.0604 | 0.2010 | 0.1086 | 0.3042 | 0.1223 | 0.4320 | 0.1311 |
| KerasGRU4Rec | OOB | 0.0511 | 0.0511 | 0.1723 | 0.0920 | 0.2704 | 0.1049 | 0.3922 | 0.1133 |
| KerasGRU4Rec | Correct exp | 0.0518 | 0.0518 | 0.1774 | 0.0945 | 0.2694 | 0.1066 | 0.3853 | 0.1146 |
| KerasGRU4Rec | Correct full | 0.0510 | 0.0510 | 0.1755 | 0.0935 | 0.2699 | 0.1060 | 0.3869 | 0.1141 |

Metric difference compared to the “Best params” version with the corresponding loss#

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | KerasGRU4Rec params | -11.66% | -11.66% | -11.62% | -11.83% | -12.02% | -11.98% | -11.70% | -12.00% |
| GRU4Rec Official | KerasGRU4Rec params | -6.72% | -6.72% | -9.43% | -8.02% | -10.89% | -8.65% | -11.37% | -8.99% |
| KerasGRU4Rec | OOB | -21.10% | -21.10% | -22.38% | -22.09% | -20.80% | -21.61% | -19.53% | -21.32% |
| KerasGRU4Rec | Correct exp | -19.91% | -19.91% | -20.08% | -20.02% | -21.08% | -20.36% | -20.94% | -20.44% |
| KerasGRU4Rec | Correct full | -21.27% | -21.27% | -20.94% | -20.82% | -20.92% | -20.81% | -20.61% | -20.81% |

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | GRU4Rec Official | KerasGRU4Rec | KerasGRU4Rec | KerasGRU4Rec |
| --- | --- | --- | --- | --- | --- | --- |
| Variant | Best params | KerasGRU4Rec params | KerasGRU4Rec params | OOB | Correct exp | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adam | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False | False |
| embedding | 0 | 0 | 0 | 0 | 0 | 0 |
| final_act | softmax | softmax | softmax | softmax | softmax | softmax |
| layers | 192 | 192 | 192 | 192 | 192 | 192 |
| batch_size | 128 | 128 | 128 | 100 | 128 | 128 |
| dropout_p_embed | 0.45 | 0 | 0 | N/A | N/A | N/A |
| dropout_p_hidden | 0.15 | 0.15 | 0.15 | 0.25 | 0.15 | 0.15 |
| learning_rate | 0.1 | 0.1 | 0.1 | 0.001 | 0.1 | 0.1 |
| momentum | 0 | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 2048 | ALL | N/A | N/A | N/A |
| sample_alpha | 0 | 0 | N/A | N/A | N/A | N/A |
| bpreg | 0 | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A | N/A |

Runtime metrics#

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 4.52 | | | 1134.52 | 144959.00 |
| GRU4Rec Official | KerasGRU4Rec params | 4.06 | 0.90 x | | 1264.57 | 161575.00 |
| GRU4Rec Official | KerasGRU4Rec params | 74.84 | 16.56 x | 18.43 x | 68.58 | 8763.00 |
| KerasGRU4Rec | OOB | 359.31 | 79.49 x | 88.50 x | 14.26 | 1824.70 |
| KerasGRU4Rec | Correct exp | 355.51 | 78.65 x | 87.56 x | 14.45 | 1850.14 |
| KerasGRU4Rec | Correct full | 367.03 | 81.20 x | 90.40 x | 13.96 | 1786.86 |