Rees46#

The dataset contains 8 months of user behavior data (view, cart, and purchase events) from a multi-category e-commerce website, collected between October 2019 and April 2020. We use only the first two months of the available data. This mirrors how real-life recommenders are usually trained only on the most recent 0.5–2 months to avoid concept drift, provided the traffic is large enough. We use only view events for next-item prediction. The dataset does come with precomputed sessions, but it is unclear how user histories were split into them: for example, a few sessions contain events from multiple users, and the time gap between consecutive events within a session can be arbitrarily long. Therefore, sessions were recomputed from the user histories using the standard 1-hour session gap threshold.
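The session recomputation described above can be sketched as follows. This is a minimal illustration, not the preprocessing script used for the experiments: the function name `split_sessions`, the `(user_id, timestamp, item_id)` tuple layout, and the choice to start a new session only when the gap is strictly greater than the threshold are all assumptions.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 3600  # the 1-hour threshold named in the text


def split_sessions(events, gap=SESSION_GAP_SECONDS):
    """Split per-user histories into sessions (hypothetical helper).

    events: iterable of (user_id, timestamp_in_seconds, item_id) tuples.
    Returns a list of (user_id, [item_id, ...]) pairs, one per session.
    """
    # Group events by user so a session never mixes users.
    by_user = defaultdict(list)
    for user_id, ts, item_id in events:
        by_user[user_id].append((ts, item_id))

    sessions = []
    for user_id, history in by_user.items():
        history.sort(key=lambda e: e[0])  # chronological order
        current, last_ts = [], None
        for ts, item_id in history:
            # Assumption: a gap strictly larger than the threshold closes the session.
            if last_ts is not None and ts - last_ts > gap:
                sessions.append((user_id, current))
                current = []
            current.append(item_id)
            last_ts = ts
        if current:
            sessions.append((user_id, current))
    return sessions


events = [
    ("u1", 0, "a"),
    ("u1", 100, "b"),
    ("u1", 100 + 3601, "c"),  # more than one hour after the previous event
    ("u2", 50, "d"),
]
sessions = split_sessions(events)
```

With the toy events above, user `u1` ends up with two sessions (`a, b` and `c`) and user `u2` with one.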

GRU4REC-pytorch#

Metrics#

Loss: bpr-max

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1027 | 0.1027 | 0.2897 | 0.1680 | 0.4027 | 0.1831 | 0.5206 | 0.1913 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0280 | 0.0280 | 0.1061 | 0.0542 | 0.1732 | 0.0631 | 0.2615 | 0.0691 |
| GRU4REC-pytorch | OOB | 0.0050 | 0.0050 | 0.0326 | 0.0136 | 0.0685 | 0.0183 | 0.1316 | 0.0225 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0048 | 0.0048 | 0.0324 | 0.0134 | 0.0713 | 0.0185 | 0.1418 | 0.0232 |
| GRU4REC-pytorch | Correct full | 0.0109 | 0.0109 | 0.0522 | 0.0242 | 0.0992 | 0.0303 | 0.1764 | 0.0355 |
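The page does not spell out how Recall@N and MRR@N are computed. The sketch below uses the usual definitions for next-item prediction and assumes each test event has already been reduced to the 1-based rank of its target item; the function names and the toy ranks are illustrative only, and tie handling is ignored.

```python
def recall_at_k(ranks, k):
    """Fraction of test events whose target item is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting 0 for targets ranked below position k."""
    return sum(1.0 / r for r in ranks if r <= k) / len(ranks)


# Hypothetical 1-based ranks of the target item for five test events.
ranks = [1, 3, 12, 2, 25]

recall_1, mrr_1 = recall_at_k(ranks, 1), mrr_at_k(ranks, 1)
recall_5, mrr_5 = recall_at_k(ranks, 5), mrr_at_k(ranks, 5)
print(recall_1 == mrr_1)  # True
```

Under these definitions Recall@1 and MRR@1 are always equal, because a hit at cutoff 1 has reciprocal rank 1. This agrees with the identical Recall@1 and MRR@1 columns in the tables.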

Metrics#

Loss: cross-entropy

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1108 | 0.1108 | 0.3000 | 0.1772 | 0.4126 | 0.1922 | 0.5291 | 0.2003 |
| GRU4Rec Official | GRU4REC-pytorch params | 0.0340 | 0.0340 | 0.1283 | 0.0657 | 0.2066 | 0.0760 | 0.3059 | 0.0828 |
| GRU4REC-pytorch | OOB | 0.0578 | 0.0578 | 0.0879 | 0.0698 | 0.0955 | 0.0708 | 0.1028 | 0.0713 |
| GRU4REC-pytorch | OOB Correct Eval | 0.0605 | 0.0605 | 0.0945 | 0.0740 | 0.1022 | 0.0750 | 0.1093 | 0.0755 |
| GRU4REC-pytorch | Correct full | 0.0102 | 0.0102 | 0.0531 | 0.0238 | 0.1080 | 0.0309 | 0.1994 | 0.0371 |

Metric difference compared to the “Best params” version with the corresponding loss#

Loss: bpr-max

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -72.74% | -72.74% | -63.38% | -67.73% | -57.00% | -65.56% | -49.76% | -63.86% |
| GRU4REC-pytorch | OOB | -95.17% | -95.17% | -88.74% | -91.89% | -82.98% | -90.01% | -74.72% | -88.21% |
| GRU4REC-pytorch | OOB Correct Eval | -95.35% | -95.35% | -88.81% | -92.01% | -82.30% | -89.92% | -72.76% | -87.86% |
| GRU4REC-pytorch | Correct full | -89.38% | -89.38% | -81.97% | -85.61% | -75.37% | -83.46% | -66.11% | -81.43% |
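The Diff columns appear to be the relative change of each metric with respect to the "Best params" run that uses the same loss. The sketch below checks this reading against one cell, using the rounded bpr-max Recall@1 values reported for the two GRU4Rec Official runs; the helper name `relative_diff` is an assumption. Because the published metrics are rounded to four decimals, other cells recomputed this way may differ from the table in the last digit.

```python
def relative_diff(value, baseline):
    """Relative change of `value` with respect to `baseline`, as a fraction."""
    return (value - baseline) / baseline


# Recall@1 with the bpr-max loss on Rees46 (rounded values from the metrics table).
best = 0.1027     # GRU4Rec Official, Best params
matched = 0.0280  # GRU4Rec Official, GRU4REC-pytorch params

diff_text = f"{relative_diff(matched, best):+.2%}"
print(diff_text)  # -72.74%
```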

Metric difference compared to the “Best params” version with the corresponding loss#

Loss: cross-entropy

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4REC-pytorch params | -69.35% | -69.35% | -57.23% | -62.92% | -49.92% | -60.45% | -42.18% | -58.64% |
| GRU4REC-pytorch | OOB | -47.82% | -47.82% | -70.70% | -60.62% | -76.85% | -63.16% | -80.57% | -64.40% |
| GRU4REC-pytorch | OOB Correct Eval | -45.38% | -45.38% | -68.49% | -58.26% | -75.23% | -60.98% | -79.34% | -62.31% |
| GRU4REC-pytorch | Correct full | -90.78% | -90.78% | -82.31% | -86.58% | -73.82% | -83.93% | -62.31% | -81.47% |

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
| --- | --- | --- | --- | --- | --- |
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | bpr-max | bpr-max | bpr-max | bpr-max | bpr-max |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 512 | 512 | 512 | 512 |
| final_act | elu-0.5 | elu-0.5 | elu-0.5 | elu-0.5 | elu-0.5 |
| layers | 512 | 512 | 512 | 512 | 512 |
| batch_size | 32 | 32 | 32 | 32 | 32 |
| dropout_p_embed | 0.1 | 0.1 | N/A | N/A | 0.1 |
| dropout_p_hidden | 0 | 0 | N/A | N/A | 0 |
| learning_rate | 0.03 | 0.03 | 0.03 | 0.03 | 0.03 |
| momentum | 0.55 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0.2 | 0 | N/A | N/A | N/A |
| bpreg | 0.75 | 0 | N/A | N/A | N/A |
| logq | 0 | 0 | N/A | N/A | N/A |

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | GRU4REC-pytorch | GRU4REC-pytorch | GRU4REC-pytorch |
| --- | --- | --- | --- | --- | --- |
| Variant | Best params | GRU4REC-pytorch params | OOB | OOB Correct Eval | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 512 | 512 | 512 | 512 |
| final_act | softmax | softmax | softmax | softmax | softmax |
| layers | 512 | 512 | 512 | 512 | 512 |
| batch_size | 240 | 240 | 240 | 240 | 240 |
| dropout_p_embed | 0.45 | 0.45 | N/A | N/A | 0.45 |
| dropout_p_hidden | 0 | 0 | N/A | N/A | 0 |
| learning_rate | 0.065 | 0.065 | 0.065 | 0.065 | 0.065 |
| momentum | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A | N/A |
| sample_alpha | 0.5 | 0 | N/A | N/A | N/A |
| bpreg | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A |

Runtime metrics#

Loss: bpr-max

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 1956.80 | | | 916.45 | 29326.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 1492.53 | 0.76 x | | 1201.51 | 38448.00 |
| GRU4REC-pytorch | OOB | 29528.42 | 15.09 x | 19.78 x | 60.73 | 1943.38 |
| GRU4REC-pytorch | OOB Correct Eval | 29553.78 | 15.10 x | 19.80 x | 60.68 | 1941.71 |
| GRU4REC-pytorch | Correct full | 30497.64 | 15.59 x | 20.43 x | 58.80 | 1881.62 |
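The "to Best" and "to Matching" columns read as epoch-time ratios: the run's average epoch time divided by that of the official implementation with its best hyperparameters, or with the hyperparameters matched to the third-party implementation. This interpretation is an assumption inferred from the numbers, not a definition given on the page; the sketch below checks it against the GRU4REC-pytorch OOB row for the bpr-max loss.

```python
def slowdown(epoch_time, reference_epoch_time):
    """How many times longer an epoch takes than in the reference run."""
    return epoch_time / reference_epoch_time


# Average epoch times in seconds (bpr-max loss, from the runtime table).
official_best = 1956.80      # GRU4Rec Official, Best params
official_matching = 1492.53  # GRU4Rec Official, GRU4REC-pytorch params
pytorch_oob = 29528.42       # GRU4REC-pytorch, OOB

to_best = f"{slowdown(pytorch_oob, official_best):.2f} x"
to_matching = f"{slowdown(pytorch_oob, official_matching):.2f} x"
print(to_best)      # 15.09 x
print(to_matching)  # 19.78 x
```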

Runtime metrics#

Loss: cross-entropy

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 367.41 | | | 650.91 | 156189.00 |
| GRU4Rec Official | GRU4REC-pytorch params | 283.97 | 0.77 x | | 842.10 | 202079.00 |
| GRU4REC-pytorch | OOB | 7618.05 | 20.73 x | 26.83 x | 31.39 | 7532.51 |
| GRU4REC-pytorch | OOB Correct Eval | 7615.82 | 20.73 x | 26.82 x | 31.39 | 7534.72 |
| GRU4REC-pytorch | Correct full | 7118.13 | 19.37 x | 25.07 x | 33.59 | 8061.53 |

Torch-GRU4Rec#

Metrics#

Loss: bpr-max

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1027 | 0.1027 | 0.2897 | 0.1680 | 0.4027 | 0.1831 | 0.5206 | 0.1913 |
| GRU4Rec Official | Torch-GRU4Rec params | 0.0968 | 0.0968 | 0.2801 | 0.1607 | 0.3923 | 0.1757 | 0.5112 | 0.1839 |
| Torch-GRU4Rec | OOB | 0.0954 | 0.0954 | 0.2774 | 0.1588 | 0.3894 | 0.1737 | 0.5081 | 0.1820 |

Metrics#

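Loss: cross-entropy

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1108 | 0.1108 | 0.3000 | 0.1772 | 0.4126 | 0.1922 | 0.5291 | 0.2003 |
| GRU4Rec Official | Torch-GRU4Rec params | 0.0825 | 0.0825 | 0.2484 | 0.1401 | 0.3551 | 0.1543 | 0.4716 | 0.1624 |
| Torch-GRU4Rec | OOB | 0.0814 | 0.0814 | 0.2459 | 0.1383 | 0.3531 | 0.1525 | 0.4688 | 0.1606 |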

Metric difference compared to the “Best params” version with the corresponding loss#

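Loss: bpr-max

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | Torch-GRU4Rec params | -5.78% | -5.78% | -3.34% | -4.34% | -2.59% | -4.04% | -1.80% | -3.83% |
| Torch-GRU4Rec | OOB | -7.12% | -7.12% | -4.28% | -5.50% | -3.32% | -5.11% | -2.40% | -4.86% |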

Metric difference compared to the “Best params” version with the corresponding loss#

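Loss: cross-entropy

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | Torch-GRU4Rec params | -25.58% | -25.58% | -17.21% | -20.94% | -13.94% | -19.73% | -10.87% | -18.94% |
| Torch-GRU4Rec | OOB | -26.58% | -26.58% | -18.05% | -21.96% | -14.42% | -20.63% | -11.40% | -19.83% |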

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | Torch-GRU4Rec |
| --- | --- | --- | --- |
| Variant | Best params | Torch-GRU4Rec params | OOB |
| loss | bpr-max | bpr-max | bpr-max |
| optim | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False |
| embedding | 0 | 512 | 512 |
| final_act | elu-0.5 | elu-0.5 | elu-0.5 |
| layers | 512 | 512 | 512 |
| batch_size | 32 | 32 | 32 |
| dropout_p_embed | 0.1 | 0.1 | 0.1 |
| dropout_p_hidden | 0 | 0 | 0 |
| learning_rate | 0.03 | 0.03 | 0.03 |
| momentum | 0.55 | 0 | N/A |
| n_sample | 2048 | 2048 | 2048 |
| sample_alpha | 0.2 | 0.2 | 0.2 |
| bpreg | 0.75 | 0.75 | 0.75 |
| logq | 0 | 0 | N/A |

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | Torch-GRU4Rec |
| --- | --- | --- | --- |
| Variant | Best params | Torch-GRU4Rec params | OOB |
| loss | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False |
| embedding | 0 | 512 | 512 |
| final_act | softmax | softmax | softmax |
| layers | 512 | 512 | 512 |
| batch_size | 240 | 240 | 240 |
| dropout_p_embed | 0.45 | 0.45 | 0.45 |
| dropout_p_hidden | 0 | 0 | 0 |
| learning_rate | 0.065 | 0.065 | 0.065 |
| momentum | 0 | 0 | N/A |
| n_sample | 2048 | 2048 | 2048 |
| sample_alpha | 0.5 | 0.5 | 0.5 |
| bpreg | 0 | 0 | 0 |
| logq | 1 | 0 | N/A |

Runtime metrics#

Loss: bpr-max

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 1956.80 | | | 916.45 | 29326.00 |
| GRU4Rec Official | Torch-GRU4Rec params | 1816.76 | 0.93 x | | 987.09 | 31587.00 |
| Torch-GRU4Rec | OOB | 30689.54 | 15.68 x | 16.89 x | 58.43 | 1869.86 |

Runtime metrics#

Loss: cross-entropy

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 367.41 | | | 650.91 | 156189.00 |
| GRU4Rec Official | Torch-GRU4Rec params | 381.71 | 1.04 x | | 626.53 | 150339.00 |
| Torch-GRU4Rec | OOB | 7192.88 | 19.58 x | 18.84 x | 33.25 | 7978.05 |

Recpack#

Metrics#

Loss: bpr-max

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1027 | 0.1027 | 0.2897 | 0.1680 | 0.4027 | 0.1831 | 0.5206 | 0.1913 |
| GRU4Rec Official | Recpack params | 0.0896 | 0.0896 | 0.2654 | 0.1508 | 0.3787 | 0.1658 | 0.5010 | 0.1743 |
| Recpack | OOB | 0.0762 | 0.0762 | 0.2346 | 0.1311 | 0.3405 | 0.1451 | 0.4624 | 0.1536 |
| Recpack | Correct exp | 0.0767 | 0.0767 | 0.2364 | 0.1321 | 0.3417 | 0.1461 | 0.4624 | 0.1544 |
| Recpack | Correct full | 0.0774 | 0.0774 | 0.2399 | 0.1338 | 0.3458 | 0.1478 | 0.4666 | 0.1562 |

Metrics#

Loss: cross-entropy

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1108 | 0.1108 | 0.3000 | 0.1772 | 0.4126 | 0.1922 | 0.5291 | 0.2003 |
| GRU4Rec Official | Recpack params (n_sample=2048) | 0.0840 | 0.0840 | 0.2531 | 0.1427 | 0.3625 | 0.1572 | 0.4812 | 0.1655 |
| GRU4Rec Official | Recpack params (n_sample=ALL) | 0.1078 | 0.1078 | 0.2937 | 0.1729 | 0.4034 | 0.1875 | 0.5174 | 0.1955 |
| Recpack | OOB | 0.0740 | 0.0740 | 0.2292 | 0.1281 | 0.3281 | 0.1412 | 0.4412 | 0.1490 |
| Recpack | Correct exp | 0.0754 | 0.0754 | 0.2304 | 0.1294 | 0.3288 | 0.1424 | 0.4414 | 0.1502 |
| Recpack | Correct full | 0.0754 | 0.0754 | 0.2299 | 0.1289 | 0.3294 | 0.1421 | 0.4386 | 0.1496 |

Metric difference compared to the “Best params” version with the corresponding loss#

Loss: bpr-max

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | Recpack params | -12.73% | -12.73% | -8.41% | -10.27% | -5.98% | -9.42% | -3.76% | -8.85% |
| Recpack | OOB | -25.79% | -25.79% | -19.03% | -21.99% | -15.45% | -20.73% | -11.17% | -19.71% |
| Recpack | Correct exp | -25.30% | -25.30% | -18.41% | -21.37% | -15.16% | -20.21% | -11.17% | -19.25% |
| Recpack | Correct full | -24.63% | -24.63% | -17.21% | -20.38% | -14.15% | -19.26% | -10.37% | -18.34% |

Metric difference compared to the “Best params” version with the corresponding loss#

Loss: cross-entropy

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | Recpack params (n_sample=2048) | -24.21% | -24.21% | -15.63% | -19.47% | -12.16% | -18.19% | -9.05% | -17.39% |
| GRU4Rec Official | Recpack params (n_sample=ALL) | -2.76% | -2.76% | -2.10% | -2.40% | -2.23% | -2.42% | -2.22% | -2.41% |
| Recpack | OOB | -33.20% | -33.20% | -23.61% | -27.73% | -20.48% | -26.53% | -16.61% | -25.60% |
| Recpack | Correct exp | -31.97% | -31.97% | -23.19% | -26.98% | -20.31% | -25.89% | -16.59% | -25.00% |
| Recpack | Correct full | -31.99% | -31.99% | -23.36% | -27.27% | -20.18% | -26.07% | -17.11% | -25.29% |

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | Recpack | Recpack | Recpack |
| --- | --- | --- | --- | --- | --- |
| Variant | Best params | Recpack params | OOB | Correct exp | Correct full |
| loss | bpr-max | bpr-max | bpr-max | bpr-max | bpr-max |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False |
| embedding | 0 | 512 | 512 | 512 | 512 |
| final_act | elu-0.5 | linear | N/A | N/A | N/A |
| layers | 512 | 512 | 512 | 512 | 512 |
| batch_size | 32 | 32 | 32 | 32 | 32 |
| dropout_p_embed | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| dropout_p_hidden | 0 | 0 | 0.1 | 0.1 | 0 |
| learning_rate | 0.03 | 0.03 | 0.03 | 0.03 | 0.03 |
| momentum | 0.55 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 2048 | 2048 | 2048 | 2048 |
| sample_alpha | 0.2 | 0 | N/A | N/A | N/A |
| bpreg | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 |
| logq | 0 | 0 | N/A | N/A | N/A |

Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | GRU4Rec Official | Recpack | Recpack | Recpack |
| --- | --- | --- | --- | --- | --- | --- |
| Variant | Best params | Recpack params | Recpack params | OOB | Correct exp | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False | False |
| embedding | 0 | 512 | 512 | 512 | 512 | 512 |
| final_act | softmax | softmax | softmax | softmax | softmax | softmax |
| layers | 512 | 512 | 512 | 512 | 512 | 512 |
| batch_size | 240 | 240 | 240 | 240 | 240 | 240 |
| dropout_p_embed | 0.45 | 0.45 | 0.45 | 0.45 | 0.45 | 0.45 |
| dropout_p_hidden | 0 | 0 | 0 | 0.45 | 0.45 | 0 |
| learning_rate | 0.065 | 0.065 | 0.065 | 0.065 | 0.065 | 0.065 |
| momentum | 0 | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 2048 | ALL | N/A | N/A | N/A |
| sample_alpha | 0.5 | 0 | N/A | N/A | N/A | N/A |
| bpreg | 0 | 0 | 0 | 0 | 0 | 0 |
| logq | 1 | 0 | N/A | N/A | N/A | N/A |

Runtime metrics#

Loss: bpr-max

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 1956.80 | | | 916.45 | 29326.00 |
| GRU4Rec Official | Recpack params | 1786.73 | 0.91 x | | 1003.68 | 32118.00 |
| Recpack | OOB | 179834.54 | 91.90 x | 100.65 x | 55.21 | 319.10 |
| Recpack | Correct exp | 179486.17 | 91.72 x | 100.46 x | 55.32 | 319.72 |
| Recpack | Correct full | 180173.69 | 92.08 x | 100.84 x | 55.10 | 318.50 |

Runtime metrics#

Loss: cross-entropy

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 367.41 | | | 650.91 | 156189.00 |
| GRU4Rec Official | Recpack params (n_sample=2048) | 382.60 | 1.04 x | | 625.07 | 149988.00 |
| GRU4Rec Official | Recpack params (n_sample=ALL) | 36064.38 | 98.16 x | 94.26 x | 6.63 | 1591.00 |
| Recpack | OOB | 49262.16 | 134.08 x | 128.76 x | 55.91 | 1164.89 |
| Recpack | Correct exp | 49366.03 | 134.36 x | 129.03 x | 55.79 | 1162.44 |
| Recpack | Correct full | 49253.13 | 134.05 x | 128.73 x | 55.92 | 1165.11 |

GRU4Rec_Tensorflow#

Note: BPR-Max is not supported by GRU4Rec_Tensorflow.

Metrics#

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1108 | 0.1108 | 0.3000 | 0.1772 | 0.4126 | 0.1922 | 0.5291 | 0.2003 |
| GRU4Rec Official | GRU4Rec_Tensorflow params | 0.0304 | 0.0304 | 0.1178 | 0.0596 | 0.1950 | 0.0697 | 0.2940 | 0.0765 |
| GRU4Rec_Tensorflow | OOB | 0.0270 | 0.0270 | 0.0895 | 0.0482 | 0.1403 | 0.0549 | 0.2066 | 0.0594 |
| GRU4Rec_Tensorflow | Correct Exp | 0.0251 | 0.0251 | 0.1026 | 0.0507 | 0.1747 | 0.0602 | 0.2698 | 0.0667 |


Metric difference compared to the “Best params” version with the corresponding loss#

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | GRU4Rec_Tensorflow params | -72.60% | -72.60% | -60.75% | -66.39% | -52.75% | -63.75% | -44.43% | -61.81% |
| GRU4Rec_Tensorflow | OOB | -75.62% | -75.62% | -70.17% | -72.81% | -65.99% | -71.45% | -60.95% | -70.33% |
| GRU4Rec_Tensorflow | Correct Exp | -77.39% | -77.39% | -65.79% | -71.36% | -57.67% | -68.69% | -49.00% | -66.69% |


Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | GRU4Rec_Tensorflow | GRU4Rec_Tensorflow |
| --- | --- | --- | --- | --- |
| Variant | Best params | GRU4Rec_Tensorflow params | OOB | Correct Exp |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adagrad |
| constrained_embedding | True | False | False | False |
| embedding | 0 | 512 | 512 | 512 |
| final_act | softmax | softmax | softmax | softmax |
| layers | 512 | 512 | 512 | 512 |
| batch_size | 240 | 240 | 50 | 240 |
| dropout_p_embed | 0.45 | 0 | N/A | N/A |
| dropout_p_hidden | 0 | 0 | 0 | 0 |
| learning_rate | 0.065 | 0.065 | 0.065 | 0.065 |
| momentum | 0 | 0 | N/A | N/A |
| n_sample | 2048 | 0 | N/A | N/A |
| sample_alpha | 0.5 | 0 | N/A | N/A |
| bpreg | 0 | 0 | N/A | N/A |
| logq | 1 | 0 | N/A | N/A |


Runtime metrics#

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 367.41 | | | 650.91 | 156189.00 |
| GRU4Rec Official | GRU4Rec_Tensorflow params | 280.85 | 0.76 x | | 851.46 | 204324.00 |
| GRU4Rec_Tensorflow | OOB | 2381.62 | 6.48 x | 8.48 x | 481.90 | 24094.84 |
| GRU4Rec_Tensorflow | Correct Exp | 531.14 | 1.45 x | 1.89 x | 450.16 | 108037.24 |

KerasGRU4Rec#

Note: BPR-Max is not supported by KerasGRU4Rec.

Metrics#

| Implementation | Variant | Recall@1 | MRR@1 | Recall@5 | MRR@5 | Recall@10 | MRR@10 | Recall@20 | MRR@20 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 0.1108 | 0.1108 | 0.3000 | 0.1772 | 0.4126 | 0.1922 | 0.5291 | 0.2003 |
| GRU4Rec Official | KerasGRU4Rec params (n_sample=2048) | 0.0918 | 0.0918 | 0.2639 | 0.1518 | 0.3713 | 0.1660 | 0.4860 | 0.1740 |
| GRU4Rec Official | KerasGRU4Rec params (n_sample=ALL) | 0.1097 | 0.1097 | 0.2946 | 0.1746 | 0.4029 | 0.1890 | 0.5152 | 0.1969 |
| KerasGRU4Rec | OOB | 0.0805 | 0.0805 | 0.2394 | 0.1354 | 0.3444 | 0.1494 | 0.4608 | 0.1574 |
| KerasGRU4Rec | Correct exp | 0.1027 | 0.1027 | 0.2799 | 0.1648 | 0.3864 | 0.1790 | 0.4979 | 0.1868 |
| KerasGRU4Rec | Correct full | 0.1027 | 0.1027 | 0.2796 | 0.1647 | 0.3859 | 0.1788 | 0.4978 | 0.1866 |


Metric difference compared to the “Best params” version with the corresponding loss#

| Implementation | Variant | Recall@1 Diff | MRR@1 Diff | Recall@5 Diff | MRR@5 Diff | Recall@10 Diff | MRR@10 Diff | Recall@20 Diff | MRR@20 Diff |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | | | | | | | | |
| GRU4Rec Official | KerasGRU4Rec params (n_sample=2048) | -17.15% | -17.15% | -12.03% | -14.35% | -10.03% | -13.61% | -8.15% | -13.13% |
| GRU4Rec Official | KerasGRU4Rec params (n_sample=ALL) | -1.07% | -1.07% | -1.79% | -1.45% | -2.35% | -1.63% | -2.64% | -1.72% |
| KerasGRU4Rec | OOB | -27.35% | -27.35% | -20.21% | -23.56% | -16.54% | -22.28% | -12.91% | -21.39% |
| KerasGRU4Rec | Correct exp | -7.32% | -7.32% | -6.70% | -6.96% | -6.35% | -6.85% | -5.89% | -6.75% |
| KerasGRU4Rec | Correct full | -7.34% | -7.34% | -6.80% | -7.06% | -6.48% | -6.96% | -5.92% | -6.84% |


Hyperparameters used in the experiment#

| Implementation | GRU4Rec Official | GRU4Rec Official | GRU4Rec Official | KerasGRU4Rec | KerasGRU4Rec | KerasGRU4Rec |
| --- | --- | --- | --- | --- | --- | --- |
| Variant | Best params | KerasGRU4Rec params | KerasGRU4Rec params | OOB | Correct exp | Correct full |
| loss | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy | cross-entropy |
| optim | adagrad | adagrad | adagrad | adam | adagrad | adagrad |
| constrained_embedding | True | False | False | False | False | False |
| embedding | 0 | 0 | 0 | 0 | 0 | 0 |
| final_act | softmax | softmax | softmax | softmax | softmax | softmax |
| layers | 512 | 512 | 512 | 100 | 512 | 512 |
| batch_size | 240 | 240 | 240 | 240 | 240 | 240 |
| dropout_p_embed | 0.45 | 0 | 0 | N/A | N/A | N/A |
| dropout_p_hidden | 0 | 0 | 0 | 0.25 | 0 | 0 |
| learning_rate | 0.065 | 0.065 | 0.065 | 0.001 | 0.065 | 0.065 |
| momentum | 0 | 0 | 0 | N/A | N/A | N/A |
| n_sample | 2048 | 2048 | ALL | N/A | N/A | N/A |
| sample_alpha | 0.5 | 0 | N/A | N/A | N/A | N/A |
| bpreg | 0 | 0 | 0 | N/A | N/A | N/A |
| logq | 1 | 0 | N/A | N/A | N/A | N/A |


Runtime metrics#

| Implementation | Variant | Avg. epoch time (s) | Avg. epoch time to Best | Avg. epoch time to Matching | Avg. mb/s | Avg. e/s |
| --- | --- | --- | --- | --- | --- | --- |
| GRU4Rec Official | Best params | 367.41 | | | 650.91 | 156189.00 |
| GRU4Rec Official | KerasGRU4Rec params (n_sample=2048) | 306.43 | 0.83 x | | 780.44 | 187270.00 |
| GRU4Rec Official | KerasGRU4Rec params (n_sample=ALL) | 35765.38 | 97.34 x | 116.72 x | 6.69 | 1604.00 |
| KerasGRU4Rec | OOB | 123265.62 | 335.50 x | 402.26 x | 1.94 | 465.52 |
| KerasGRU4Rec | Correct exp | 130917.81 | 356.33 x | 427.24 x | 1.83 | 438.31 |
| KerasGRU4Rec | Correct full | 130987.98 | 356.52 x | 427.46 x | 1.83 | 438.08 |