GRU4REC-pytorch

  • Out-of-the-box:

    Running the code on a GPU required moving the mean computation of a variable to the correct device.

  • Inference fix:

    The evaluation code now resets the hidden state when the corresponding session ends.

  • Major fixes:
    1. Fixed the order of sampling and the softmax transformation. They were applied in reverse order, which resulted in small gradients and slow convergence.

    2. The softmax transformation is now applied only once (it was applied twice).

    3. Hidden states are now reset correctly during training. The mask governing the resets was only recalculated when a session ended, resulting in false resets.

    4. The BPR-max loss is fixed to use the correct equation, but the missing score regularization was not added to the algorithm.

    5. Both dropout parameters now work as expected. Dropout on the final GRU layer and embedding dropout in separate embedding mode were originally not applied.

  • Sampling is performed after all item scores are computed, which slows down training. This bug is rooted so deep in the code that we did not fix it.
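As a rough illustration of what fixes 2 and 4 concern, the sketch below computes the BPR-max loss for a single target item in plain Python, following the commonly published formulation: a softmax over the scores of the sampled negatives (applied exactly once), a sigmoid of the score differences, and a score regularization term weighted by `bpreg`. It is not the code of either implementation. The function names are hypothetical, and only the `bpreg` name is borrowed from the hyperparameter tables.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Sigmoid that avoids overflow for large negative inputs."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def bpr_max_loss(target_score, negative_scores, bpreg=0.0, eps=1e-24):
    """BPR-max loss for one target item (illustrative sketch).

    loss = -log(sum_j s_j * sigmoid(r_i - r_j)) + bpreg * sum_j s_j * r_j^2
    where s_j is the softmax over the negative scores r_j.
    """
    # Softmax is applied once, and only over the sampled negative scores.
    weights = softmax(negative_scores)
    ranking = sum(
        w * sigmoid(target_score - r) for w, r in zip(weights, negative_scores)
    )
    # Score regularization: the term weighted by bpreg.
    regularizer = bpreg * sum(w * r * r for w, r in zip(weights, negative_scores))
    return -math.log(ranking + eps) + regularizer


if __name__ == "__main__":
    # A higher target score should give a lower loss.
    print(bpr_max_loss(5.0, [0.0, 1.0], bpreg=0.5))
    print(bpr_max_loss(0.0, [0.0, 1.0], bpreg=0.5))
```

With `bpreg=0` the regularization term vanishes, which corresponds to the setting listed for the "GRU4REC-pytorch params" variant in the hyperparameter tables.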

Rees46

Metrics

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 0.1027 | 0.1027 | 0.2897 | 0.1680 | 0.4027 | 0.1831 | 0.5206 | 0.1913 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0280 | 0.0280 | 0.1061 | 0.0542 | 0.1732 | 0.0631 | 0.2615 | 0.0691 |
| GRU4REC-pytorch | OOB | 0.0050 | 0.0050 | 0.0326 | 0.0136 | 0.0685 | 0.0183 | 0.1316 | 0.0225 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0048 | 0.0048 | 0.0324 | 0.0134 | 0.0713 | 0.0185 | 0.1418 | 0.0232 |
| GRU4REC-pytorch | Correct full | 0.0109 | 0.0109 | 0.0522 | 0.0242 | 0.0992 | 0.0303 | 0.1764 | 0.0355 |

Metrics

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 0.1108 | 0.1108 | 0.3000 | 0.1772 | 0.4126 | 0.1922 | 0.5291 | 0.2003 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0340 | 0.0340 | 0.1283 | 0.0657 | 0.2066 | 0.0760 | 0.3059 | 0.0828 |
| GRU4REC-pytorch | OOB | 0.0578 | 0.0578 | 0.0879 | 0.0698 | 0.0955 | 0.0708 | 0.1028 | 0.0713 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0605 | 0.0605 | 0.0945 | 0.0740 | 0.1022 | 0.0750 | 0.1093 | 0.0755 |
| GRU4REC-pytorch | Correct full | 0.0102 | 0.0102 | 0.0531 | 0.0238 | 0.1080 | 0.0309 | 0.1994 | 0.0371 |

Metric difference compared to the “Best params” version with the corresponding loss

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -72.74% | -72.74% | -63.38% | -67.73% | -57.00% | -65.56% | -49.76% | -63.86% |
| GRU4REC-pytorch | OOB | -95.17% | -95.17% | -88.74% | -91.89% | -82.98% | -90.01% | -74.72% | -88.21% |
| GRU4REC-pytorch | OOB Correct Eval | -95.35% | -95.35% | -88.81% | -92.01% | -82.30% | -89.92% | -72.76% | -87.86% |
| GRU4REC-pytorch | Correct full | -89.38% | -89.38% | -81.97% | -85.61% | -75.37% | -83.46% | -66.11% | -81.43% |
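The Diff columns appear to be relative changes with respect to the “Best params” row of the same loss. A minimal sketch of that arithmetic, assuming this interpretation (the function name is hypothetical), reproduces the first Recall@1 entry from the metric values reported for this dataset:

```python
def relative_diff_percent(value, baseline):
    """Relative change of `value` with respect to `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Recall@1 on Rees46: official code with GRU4REC-pytorch params (0.0280)
    # versus the "Best params" baseline (0.1027).
    diff = relative_diff_percent(0.0280, 0.1027)
    print(f"{diff:.2f}%")  # prints -72.74%
```

Because the published metrics are rounded to four decimals, recomputing other entries this way can differ from the tables in the last digit.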

Metric difference compared to the “Best params” version with the corresponding loss

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -69.35% | -69.35% | -57.23% | -62.92% | -49.92% | -60.45% | -42.18% | -58.64% |
| GRU4REC-pytorch | OOB | -47.82% | -47.82% | -70.70% | -60.62% | -76.85% | -63.16% | -80.57% | -64.40% |
| GRU4REC-pytorch | OOB Correct Eval | -45.38% | -45.38% | -68.49% | -58.26% | -75.23% | -60.98% | -79.34% | -62.31% |
| GRU4REC-pytorch | Correct full | -90.78% | -90.78% | -82.31% | -86.58% | -73.82% | -83.93% | -62.31% | -81.47% |

Hyperparameters used in the experiment

| | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
|---|---|---|---|---|---|
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | bpr-max | bpr-max | bpr-max | bpr-max | bpr-max |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 512 | 512 | 512 | 512 |
| final_act | elu-0.5 | elu-0.5 | elu-0.5 | elu-0.5 | elu-0.5 |
| layers | 512 | 512 | 512 | 512 | 512 |
| batch_size | 32 | 32 | 32 | 32 | 32 |
| dropout_p_embed | 0.1 | 0.1 | N/A | N/A | 0.1 |
| dropout_p_hidden | 0 | 0 | N/A | N/A | 0 |
| learning_rate | 0.03 | 0.03 | 0.03 | 0.03 | 0.03 |
| momentum | 0.55 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0.2 | 0 | N/A | N/A | N/A |
| bpreg | 0.75 | 0 | N/A | N/A | N/A |
| logq | 0 | 0 | N/A | N/A | N/A |

Hyperparameters used in the experiment

| | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
|---|---|---|---|---|---|
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 512 | 512 | 512 | 512 |
| final_act | softmax | softmax | softmax | softmax | softmax |
| layers | 512 | 512 | 512 | 512 | 512 |
| batch_size | 240 | 240 | 240 | 240 | 240 |
| dropout_p_embed | 0.45 | 0.45 | N/A | N/A | 0.45 |
| dropout_p_hidden | 0 | 0 | N/A | N/A | 0 |
| learning_rate | 0.065 | 0.065 | 0.065 | 0.065 | 0.065 |
| momentum | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0.5 | 0 | N/A | N/A | N/A |
| bpreg | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A |

Runtime metrics

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 1956.80 | | | 916.45 | 29326.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 1492.53 | 0.76 x | | 1201.51 | 38448.00 |
| GRU4REC-pytorch | OOB | 29528.42 | 15.09 x | 19.78 x | 60.73 | 1943.38 |
| GRU4REC-pytorch | OOB Correct Eval | 29553.78 | 15.10 x | 19.80 x | 60.68 | 1941.71 |
| GRU4REC-pytorch | Correct full | 30497.64 | 15.59 x | 20.43 x | 58.80 | 1881.62 |
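The relative columns of the runtime tables can be reproduced from the absolute ones. The sketch below assumes, based on the numbers rather than on a stated definition, that “to Best” divides a variant's average epoch time by that of the official “Best params” run, that “to Matching” divides it by that of the official run with GRU4REC-pytorch params, and that e/s (presumably events per second) is mb/s (presumably minibatches per second) multiplied by `batch_size`. The function names are hypothetical.

```python
def slowdown(epoch_time, reference_epoch_time):
    """How many times longer an epoch takes than the reference run."""
    return epoch_time / reference_epoch_time


def events_per_second(minibatches_per_second, batch_size):
    """Throughput in events, assuming every minibatch holds batch_size events."""
    return minibatches_per_second * batch_size


if __name__ == "__main__":
    # Rees46 values from the first runtime table (batch_size = 32).
    best, matching, oob = 1956.80, 1492.53, 29528.42
    print(f"{slowdown(oob, best):.2f} x")         # prints 15.09 x
    print(f"{slowdown(oob, matching):.2f} x")     # prints 19.78 x
    print(round(events_per_second(916.45, 32)))   # prints 29326
```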

Runtime metrics

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 367.41 | | | 650.91 | 156189.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 283.97 | 0.77 x | | 842.10 | 202079.00 |
| GRU4REC-pytorch | OOB | 7618.05 | 20.73 x | 26.83 x | 31.39 | 7532.51 |
| GRU4REC-pytorch | OOB Correct Eval | 7615.82 | 20.73 x | 26.82 x | 31.39 | 7534.72 |
| GRU4REC-pytorch | Correct full | 7118.13 | 19.37 x | 25.07 x | 33.59 | 8061.53 |

Yoochoose

Metrics

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 0.1745 | 0.1745 | 0.4346 | 0.2675 | 0.5664 | 0.2851 | 0.6799 | 0.2931 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0988 | 0.0988 | 0.2613 | 0.1554 | 0.3620 | 0.1688 | 0.4655 | 0.1760 |
| GRU4REC-pytorch | OOB | 0.0002 | 0.0002 | 0.0012 | 0.0005 | 0.0031 | 0.0007 | 0.0087 | 0.0011 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0009 | 0.0009 | 0.0104 | 0.0036 | 0.0409 | 0.0074 | 0.1066 | 0.0118 |
| GRU4REC-pytorch | Correct full | 0.0108 | 0.0108 | 0.0603 | 0.0269 | 0.1112 | 0.0335 | 0.1923 | 0.0391 |

Metrics

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 0.1797 | 0.1797 | 0.4457 | 0.2757 | 0.5698 | 0.2924 | 0.6804 | 0.3002 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0717 | 0.0717 | 0.2386 | 0.1301 | 0.3478 | 0.1446 | 0.4583 | 0.1523 |
| GRU4REC-pytorch | OOB | 0.0933 | 0.0933 | 0.1090 | 0.0998 | 0.1129 | 0.1003 | 0.1169 | 0.1006 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0951 | 0.0951 | 0.1134 | 0.1029 | 0.1173 | 0.1034 | 0.1212 | 0.1037 |
| GRU4REC-pytorch | Correct full | 0.0493 | 0.0493 | 0.1997 | 0.1001 | 0.3052 | 0.1141 | 0.4271 | 0.1227 |

Metric difference compared to the “Best params” version with the corresponding loss

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -43.42% | -43.42% | -39.88% | -41.92% | -36.09% | -40.80% | -31.53% | -39.96% |
| GRU4REC-pytorch | OOB | -99.91% | -99.91% | -99.72% | -99.83% | -99.46% | -99.76% | -98.72% | -99.63% |
| GRU4REC-pytorch | OOB Correct Eval | -99.48% | -99.48% | -97.62% | -98.66% | -92.78% | -97.40% | -84.32% | -95.98% |
| GRU4REC-pytorch | Correct full | -93.80% | -93.80% | -86.13% | -89.95% | -80.36% | -88.25% | -71.72% | -86.67% |

Metric difference compared to the “Best params” version with the corresponding loss

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -60.10% | -60.10% | -46.48% | -52.83% | -38.96% | -50.54% | -32.64% | -49.27% |
| GRU4REC-pytorch | OOB | -48.10% | -48.10% | -75.54% | -63.82% | -80.19% | -65.70% | -82.82% | -66.50% |
| GRU4REC-pytorch | OOB Correct Eval | -47.07% | -47.07% | -74.56% | -62.69% | -79.42% | -64.64% | -82.19% | -65.46% |
| GRU4REC-pytorch | Correct full | -72.58% | -72.58% | -55.19% | -63.68% | -46.44% | -60.98% | -37.22% | -59.14% |

Hyperparameters used in the experiment

| | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
|---|---|---|---|---|---|
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | bpr-max | bpr-max | bpr-max | bpr-max | bpr-max |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 448 | 448 | 448 | 448 |
| final_act | linear | linear | linear | linear | linear |
| layers | 448 | 448 | 448 | 448 | 448 |
| batch_size | 48 | 48 | 48 | 48 | 48 |
| dropout_p_embed | 0.25 | 0.25 | N/A | N/A | 0.25 |
| dropout_p_hidden | 0 | 0 | N/A | N/A | 0 |
| learning_rate | 0.075 | 0.075 | 0.075 | 0.075 | 0.075 |
| momentum | 0.1 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0.2 | 0 | N/A | N/A | N/A |
| bpreg | 0.5 | 0 | N/A | N/A | N/A |
| logq | 0 | 0 | N/A | N/A | N/A |

Hyperparameters used in the experiment

| | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
|---|---|---|---|---|---|
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 480 | 480 | 480 | 480 |
| final_act | softmax | softmax | softmax | softmax | softmax |
| layers | 480 | 480 | 480 | 480 | 480 |
| batch_size | 48 | 48 | 48 | 48 | 48 |
| dropout_p_embed | 0 | 0 | N/A | N/A | 0 |
| dropout_p_hidden | 0.2 | 0.2 | N/A | N/A | 0.2 |
| learning_rate | 0.07 | 0.07 | 0.07 | 0.07 | 0.07 |
| momentum | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0.2 | 0 | N/A | N/A | N/A |
| bpreg | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A |

Runtime metrics

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 487.51 | | | 919.23 | 44121.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 362.75 | 0.74 x | | 1235.40 | 59297.00 |
| GRU4REC-pytorch | OOB | 1854.78 | 3.80 x | 5.11 x | 241.61 | 11597.27 |
| GRU4REC-pytorch | OOB Correct Eval | 1857.10 | 3.81 x | 5.12 x | 241.30 | 11582.50 |
| GRU4REC-pytorch | Correct full | 2123.40 | 4.36 x | 5.85 x | 211.06 | 10130.73 |

Runtime metrics

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 451.75 | | | 991.99 | 47613.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 350.38 | 0.78 x | | 1279.01 | 61390.00 |
| GRU4REC-pytorch | OOB | 1948.18 | 4.31 x | 5.56 x | 230.02 | 11040.99 |
| GRU4REC-pytorch | OOB Correct Eval | 1944.45 | 4.30 x | 5.55 x | 230.46 | 11062.17 |
| GRU4REC-pytorch | Correct full | 1933.25 | 4.28 x | 5.52 x | 231.80 | 11126.37 |

Coveo

Metrics

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 0.0501 | 0.0501 | 0.1464 | 0.0835 | 0.2172 | 0.0928 | 0.3123 | 0.0994 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0297 | 0.0297 | 0.0965 | 0.0525 | 0.1485 | 0.0593 | 0.2216 | 0.0643 |
| GRU4REC-pytorch | OOB | 0.0110 | 0.0110 | 0.0421 | 0.0212 | 0.0699 | 0.0248 | 0.1141 | 0.0279 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0140 | 0.0140 | 0.0567 | 0.0281 | 0.0934 | 0.0329 | 0.1510 | 0.0368 |
| GRU4REC-pytorch | Correct full | 0.0165 | 0.0165 | 0.0626 | 0.0320 | 0.1014 | 0.0371 | 0.1582 | 0.0410 |

Metrics

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 0.0489 | 0.0489 | 0.1418 | 0.0814 | 0.2085 | 0.0901 | 0.2947 | 0.0960 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0275 | 0.0275 | 0.0871 | 0.0478 | 0.1362 | 0.0543 | 0.2061 | 0.0591 |
| GRU4REC-pytorch | OOB | 0.0312 | 0.0312 | 0.0468 | 0.0370 | 0.0537 | 0.0379 | 0.0628 | 0.0386 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0315 | 0.0315 | 0.0487 | 0.0380 | 0.0566 | 0.0390 | 0.0663 | 0.0397 |
| GRU4REC-pytorch | Correct full | 0.0222 | 0.0222 | 0.0742 | 0.0397 | 0.1175 | 0.0454 | 0.1806 | 0.0497 |

Metric difference compared to the “Best params” version with the corresponding loss

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -40.78% | -40.78% | -34.09% | -37.15% | -31.64% | -36.10% | -29.03% | -35.25% |
| GRU4REC-pytorch | OOB | -78.15% | -78.15% | -71.20% | -74.56% | -67.80% | -73.24% | -63.46% | -71.96% |
| GRU4REC-pytorch | OOB Correct Eval | -72.16% | -72.16% | -61.23% | -66.35% | -57.01% | -64.60% | -51.65% | -63.00% |
| GRU4REC-pytorch | Correct full | -67.09% | -67.09% | -57.23% | -61.67% | -53.33% | -60.04% | -49.33% | -58.77% |

Metric difference compared to the “Best params” version with the corresponding loss

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -43.74% | -43.74% | -38.54% | -41.18% | -34.70% | -39.78% | -30.08% | -38.46% |
| GRU4REC-pytorch | OOB | -36.22% | -36.22% | -66.97% | -54.48% | -74.23% | -57.91% | -78.70% | -59.84% |
| GRU4REC-pytorch | OOB Correct Eval | -35.53% | -35.53% | -65.62% | -53.30% | -72.86% | -56.69% | -77.50% | -58.65% |
| GRU4REC-pytorch | Correct full | -54.53% | -54.53% | -47.65% | -51.22% | -43.66% | -49.61% | -38.73% | -48.21% |

Hyperparameters used in the experiment

| | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
|---|---|---|---|---|---|
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | bpr-max | bpr-max | bpr-max | bpr-max | bpr-max |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 512 | 512 | 512 | 512 |
| final_act | elu-1 | elu-1 | elu-1 | elu-1 | elu-1 |
| layers | 512 | 512 | 512 | 512 | 512 |
| batch_size | 144 | 144 | 144 | 144 | 144 |
| dropout_p_embed | 0.35 | 0.35 | N/A | N/A | 0.35 |
| dropout_p_hidden | 0 | 0 | N/A | N/A | 0 |
| learning_rate | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 |
| momentum | 0.4 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0.2 | 0 | N/A | N/A | N/A |
| bpreg | 1.85 | 0 | N/A | N/A | N/A |
| logq | 0 | 0 | N/A | N/A | N/A |

Hyperparameters used in the experiment

| | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
|---|---|---|---|---|---|
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 512 | 512 | 512 | 512 |
| final_act | softmax | softmax | softmax | softmax | softmax |
| layers | 512 | 512 | 512 | 512 | 512 |
| batch_size | 32 | 32 | 32 | 32 | 32 |
| dropout_p_embed | 0.4 | 0.4 | N/A | N/A | 0.4 |
| dropout_p_hidden | 0.15 | 0.15 | N/A | N/A | 0.15 |
| learning_rate | 0.03 | 0.03 | 0.03 | 0.03 | 0.03 |
| momentum | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0 | 0 | N/A | N/A | N/A |
| bpreg | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A |

Runtime metrics

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 12.38 | | | 704.14 | 100615.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 9.25 | 0.75 x | | 941.96 | 134690.00 |
| GRU4REC-pytorch | OOB | 23.35 | 1.89 x | 2.52 x | 370.06 | 53288.62 |
| GRU4REC-pytorch | OOB Correct Eval | 23.19 | 1.87 x | 2.51 x | 372.64 | 53660.93 |
| GRU4REC-pytorch | Correct full | 31.91 | 2.58 x | 3.45 x | 271.16 | 39046.98 |

Runtime metrics

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 37.17 | | | 1047.47 | 33505.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 30.51 | 0.82 x | | 1276.12 | 40819.00 |
| GRU4REC-pytorch | OOB | 85.06 | 2.29 x | 2.79 x | 457.51 | 14640.34 |
| GRU4REC-pytorch | OOB Correct Eval | 86.98 | 2.34 x | 2.85 x | 447.82 | 14330.19 |
| GRU4REC-pytorch | Correct full | 89.90 | 2.42 x | 2.95 x | 432.89 | 13852.60 |

Retailrocket

Metrics

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 0.1224 | 0.1224 | 0.3196 | 0.1928 | 0.4181 | 0.2060 | 0.5187 | 0.2131 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0560 | 0.0560 | 0.1595 | 0.0916 | 0.2294 | 0.1009 | 0.3142 | 0.1067 |
| GRU4REC-pytorch | OOB | 0.0096 | 0.0096 | 0.0355 | 0.0184 | 0.0544 | 0.0210 | 0.0794 | 0.0227 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0406 | 0.0406 | 0.1341 | 0.0728 | 0.1933 | 0.0807 | 0.2528 | 0.0848 |
| GRU4REC-pytorch | Correct full | 0.0456 | 0.0456 | 0.1443 | 0.0804 | 0.1976 | 0.0875 | 0.2554 | 0.0916 |

Metrics

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 0.1150 | 0.1150 | 0.3026 | 0.1814 | 0.4019 | 0.1947 | 0.4953 | 0.2012 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0578 | 0.0578 | 0.1681 | 0.0955 | 0.2441 | 0.1056 | 0.3378 | 0.1121 |
| GRU4REC-pytorch | OOB | 0.0479 | 0.0479 | 0.0561 | 0.0511 | 0.0590 | 0.0515 | 0.0616 | 0.0517 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0510 | 0.0510 | 0.0582 | 0.0539 | 0.0607 | 0.0543 | 0.0642 | 0.0545 |
| GRU4REC-pytorch | Correct full | 0.0492 | 0.0492 | 0.1544 | 0.0858 | 0.2163 | 0.0941 | 0.2812 | 0.0986 |

Metric difference compared to the “Best params” version with the corresponding loss

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -54.23% | -54.23% | -50.10% | -52.48% | -45.13% | -51.04% | -39.41% | -49.92% |
| GRU4REC-pytorch | OOB | -92.12% | -92.12% | -88.90% | -90.45% | -86.99% | -89.83% | -84.69% | -89.36% |
| GRU4REC-pytorch | OOB Correct Eval | -66.88% | -66.88% | -58.06% | -62.23% | -53.78% | -60.84% | -51.27% | -60.18% |
| GRU4REC-pytorch | Correct full | -62.76% | -62.76% | -54.84% | -58.30% | -52.74% | -57.52% | -50.76% | -57.01% |

Metric difference compared to the “Best params” version with the corresponding loss

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -49.71% | -49.71% | -44.43% | -47.36% | -39.25% | -45.78% | -31.79% | -44.32% |
| GRU4REC-pytorch | OOB | -58.36% | -58.36% | -81.47% | -71.82% | -85.31% | -73.53% | -87.55% | -74.30% |
| GRU4REC-pytorch | OOB Correct Eval | -55.66% | -55.66% | -80.76% | -70.27% | -84.90% | -72.13% | -87.05% | -72.91% |
| GRU4REC-pytorch | Correct full | -57.19% | -57.19% | -48.99% | -52.73% | -46.19% | -51.69% | -43.23% | -51.02% |

Hyperparameters used in the experiment

| | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
|---|---|---|---|---|---|
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | bpr-max | bpr-max | bpr-max | bpr-max | bpr-max |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 224 | 224 | 224 | 224 |
| final_act | elu-0.5 | elu-0.5 | elu-0.5 | elu-0.5 | elu-0.5 |
| layers | 224 | 224 | 224 | 224 | 224 |
| batch_size | 80 | 80 | 80 | 80 | 80 |
| dropout_p_embed | 0.5 | 0.5 | N/A | N/A | 0.5 |
| dropout_p_hidden | 0.05 | 0.05 | N/A | N/A | 0.05 |
| learning_rate | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 |
| momentum | 0.4 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0.4 | 0 | N/A | N/A | N/A |
| bpreg | 1.95 | 0 | N/A | N/A | N/A |
| logq | 0 | 0 | N/A | N/A | N/A |

Hyperparameters used in the experiment

| | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
|---|---|---|---|---|---|
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 192 | 192 | 192 | 192 |
| final_act | softmax | softmax | softmax | softmax | softmax |
| layers | 192 | 192 | 192 | 192 | 192 |
| batch_size | 240 | 240 | 240 | 240 | 240 |
| dropout_p_embed | 0.5 | 0.5 | N/A | N/A | 0.5 |
| dropout_p_hidden | 0.05 | 0.05 | N/A | N/A | 0.05 |
| learning_rate | 0.085 | 0.085 | 0.085 | 0.085 | 0.085 |
| momentum | 0.3 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0.3 | 0 | N/A | N/A | N/A |
| bpreg | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A |

Runtime metrics

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 6.86 | | | 1019.34 | 80807.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 5.22 | 0.76 x | | 1338.98 | 106326.00 |
| GRU4REC-pytorch | OOB | 20.53 | 2.99 x | 3.93 x | 337.68 | 27014.01 |
| GRU4REC-pytorch | OOB Correct Eval | 20.44 | 2.98 x | 3.92 x | 339.01 | 27120.93 |
| GRU4REC-pytorch | Correct full | 24.42 | 3.56 x | 4.68 x | 283.82 | 22705.56 |

Runtime metrics

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 2.77 | | | 880.71 | 199935.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 1.89 | 0.68 x | | 1279.60 | 293600.00 |
| GRU4REC-pytorch | OOB | 10.10 | 3.65 x | 5.34 x | 228.56 | 54855.21 |
| GRU4REC-pytorch | OOB Correct Eval | 10.03 | 3.62 x | 5.31 x | 230.11 | 55225.62 |
| GRU4REC-pytorch | Correct full | 9.14 | 3.30 x | 4.84 x | 252.47 | 60592.49 |

Diginetica

Metrics

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 0.0688 | 0.0688 | 0.2304 | 0.1237 | 0.3533 | 0.1399 | 0.4995 | 0.1500 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0365 | 0.0365 | 0.1283 | 0.0675 | 0.2066 | 0.0778 | 0.3141 | 0.0851 |
| GRU4REC-pytorch | OOB | 0.0006 | 0.0006 | 0.0020 | 0.0011 | 0.0039 | 0.0013 | 0.0070 | 0.0015 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0239 | 0.0239 | 0.0937 | 0.0472 | 0.1537 | 0.0551 | 0.2367 | 0.0608 |
| GRU4REC-pytorch | Correct full | 0.0277 | 0.0277 | 0.1070 | 0.0543 | 0.1747 | 0.0632 | 0.2616 | 0.0692 |

Metrics

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 0.0647 | 0.0647 | 0.2220 | 0.1181 | 0.3414 | 0.1339 | 0.4874 | 0.1440 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0296 | 0.0296 | 0.1133 | 0.0576 | 0.1888 | 0.0675 | 0.2973 | 0.0749 |
| GRU4REC-pytorch | OOB | 0.0287 | 0.0287 | 0.0376 | 0.0321 | 0.0415 | 0.0327 | 0.0457 | 0.0329 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0321 | 0.0321 | 0.0415 | 0.0357 | 0.0457 | 0.0363 | 0.0503 | 0.0366 |
| GRU4REC-pytorch | Correct full | 0.0288 | 0.0288 | 0.1135 | 0.0572 | 0.1860 | 0.0667 | 0.2862 | 0.0736 |

Metric difference compared to the “Best params” version with the corresponding loss

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -47.02% | -47.02% | -44.31% | -45.41% | -41.53% | -44.39% | -37.13% | -43.22% |
| GRU4REC-pytorch | OOB | -99.15% | -99.15% | -99.11% | -99.14% | -98.90% | -99.07% | -98.60% | -98.99% |
| GRU4REC-pytorch | OOB Correct Eval | -65.34% | -65.34% | -59.33% | -61.86% | -56.49% | -60.61% | -52.62% | -59.47% |
| GRU4REC-pytorch | Correct full | -59.74% | -59.74% | -53.55% | -56.08% | -50.55% | -54.81% | -47.62% | -53.87% |

Metric difference compared to the “Best params” version with the corresponding loss

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
|---|---|---|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -54.28% | -54.28% | -48.96% | -51.20% | -44.68% | -49.54% | -39.00% | -47.98% |
| GRU4REC-pytorch | OOB | -55.59% | -55.59% | -83.08% | -72.79% | -87.86% | -75.61% | -90.62% | -77.13% |
| GRU4REC-pytorch | OOB Correct Eval | -50.44% | -50.44% | -81.29% | -69.74% | -86.62% | -72.89% | -89.68% | -74.58% |
| GRU4REC-pytorch | Correct full | -55.50% | -55.50% | -48.88% | -51.59% | -45.51% | -50.15% | -41.28% | -48.90% |

Hyperparameters used in the experiment

| | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
|---|---|---|---|---|---|
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | bpr-max | bpr-max | bpr-max | bpr-max | bpr-max |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 512 | 512 | 512 | 512 |
| final_act | elu-1 | elu-1 | elu-1 | elu-1 | elu-1 |
| layers | 512 | 512 | 512 | 512 | 512 |
| batch_size | 128 | 128 | 128 | 128 | 128 |
| dropout_p_embed | 0.5 | 0.5 | N/A | N/A | 0.5 |
| dropout_p_hidden | 0.3 | 0.3 | N/A | N/A | 0.3 |
| learning_rate | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 |
| momentum | 0.15 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0.3 | 0 | N/A | N/A | N/A |
| bpreg | 0.9 | 0 | N/A | N/A | N/A |
| logq | 0 | 0 | N/A | N/A | N/A |

Hyperparameters used in the experiment

| | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
|---|---|---|---|---|---|
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 192 | 192 | 192 | 192 |
| final_act | softmax | softmax | softmax | softmax | softmax |
| layers | 192 | 192 | 192 | 192 | 192 |
| batch_size | 128 | 128 | 128 | 128 | 128 |
| dropout_p_embed | 0.45 | 0.45 | N/A | N/A | 0.45 |
| dropout_p_hidden | 0.15 | 0.15 | N/A | N/A | 0.15 |
| learning_rate | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| momentum | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0 | 0 | N/A | N/A | N/A |
| bpreg | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A |

Runtime metrics

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 8.02 | | | 639.87 | 81757.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 5.29 | 0.66 x | | 969.27 | 123869.00 |
| GRU4REC-pytorch | OOB | 32.24 | 4.02 x | 6.09 x | 158.86 | 20333.91 |
| GRU4REC-pytorch | OOB Correct Eval | 32.18 | 4.01 x | 6.08 x | 159.20 | 20378.12 |
| GRU4REC-pytorch | Correct full | 36.76 | 4.58 x | 6.95 x | 139.36 | 17838.75 |

Runtime metrics

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
|---|---|---|---|---|---|---|
| GRU4Rec Official | Best params | 4.52 | | | 1134.52 | 144959.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 3.67 | 0.81 x | | 1398.76 | 178755.00 |
| GRU4REC-pytorch | OOB | 17.65 | 3.90 x | 4.81 x | 290.27 | 37154.98 |
| GRU4REC-pytorch | OOB Correct Eval | 17.67 | 3.91 x | 4.81 x | 289.91 | 37108.16 |
| GRU4REC-pytorch | Correct full | 16.97 | 3.75 x | 4.62 x | 301.78 | 38627.84 |