GRU4REC-pytorch

Out-of-the-box:
To run the code on a GPU, the mean computation of one variable had to be moved to the correct device.
Inference fix:
The evaluation code now resets the hidden state when the corresponding session ends.
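The reset can be pictured with a small sketch in plain Python. It is not the repository's PyTorch code: the function name `reset_finished` and the list-based hidden state are hypothetical and only illustrate the idea that sessions are processed in parallel slots and a slot's hidden state is zeroed once its session has ended.

```python
def reset_finished(hidden, finished):
    """Return a copy of `hidden` with rows zeroed where `finished` is True.

    hidden:   list of per-slot hidden-state vectors (lists of floats)
    finished: list of booleans, one per slot, True if the session in
              that slot ended before the current step
    """
    return [
        [0.0] * len(h) if done else list(h)
        for h, done in zip(hidden, finished)
    ]


# Two parallel slots with a 3-dimensional hidden state each.
hidden = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.4]]

# The session in slot 1 ended, so only slot 1 is reset.
hidden = reset_finished(hidden, finished=[False, True])
print(hidden)
```

Without such a reset, the first events of a new session would be scored using the state left over from an unrelated session.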
Major fixes:
1. Fixed the order of sampling and the softmax transformation. They were applied in the reverse order, which resulted in small gradients and slow convergence.
2. The softmax transformation is now applied only once (it was applied twice).
3. Hidden states are now reset correctly during training. The mask governing the resets was recalculated only when a session ended, which resulted in false resets.
4. The BPR-max loss now uses the correct equation, but the missing score regularization was not added to the algorithm.
5. Both dropout parameters now work as expected. Originally, dropout on the final GRU layer and embedding dropout in separate embedding mode were not applied.
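The effect of applying softmax twice (item 2) can be seen on a toy example. The sketch below is not taken from either code base, and the scores are made up. Because the output of the first softmax lies in [0, 1], a second softmax maps it to a much flatter distribution.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]   # illustrative raw item scores
once = softmax(scores)     # intended: a single softmax
twice = softmax(once)      # the bug: softmax applied to probabilities

print([round(p, 3) for p in once])
print([round(p, 3) for p in twice])
```

The second list is visibly closer to uniform than the first, so the model's preference for the top item is largely washed out.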
Sampling is performed only after the scores of all items have been computed, which slows down training. This issue is rooted so deeply in the code that we did not fix it.
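For reference, the BPR-max loss with score regularization is commonly written as -log(sum_j s_j * sigmoid(r_i - r_j)) + bpreg * sum_j s_j * r_j^2, where r_i is the target score, r_j are the scores of the negative samples and s_j is the softmax of the negative scores. The following is a minimal plain-Python sketch under that assumption. It is not the code of either implementation, and the function name `bpr_max` is hypothetical.

```python
import math


def bpr_max(target_score, negative_scores, bpreg=0.0):
    """BPR-max loss for one target item against sampled negatives.

    loss = -log(sum_j s_j * sigmoid(r_i - r_j)) + bpreg * sum_j s_j * r_j**2
    where s_j is the softmax of the negative scores r_j.
    """
    m = max(negative_scores)
    exps = [math.exp(r - m) for r in negative_scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # s_j: softmax over negatives only

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    ranking = sum(s * sigmoid(target_score - r)
                  for s, r in zip(weights, negative_scores))
    penalty = sum(s * r * r for s, r in zip(weights, negative_scores))
    return -math.log(ranking) + bpreg * penalty


negatives = [0.5, -1.0, 0.2]                # made-up negative scores
print(bpr_max(2.0, negatives))              # ranking term only
print(bpr_max(2.0, negatives, bpreg=0.75))  # with score regularization
```

With `bpreg=0` the penalty term vanishes, which corresponds to the setting listed for the variants that do not use score regularization.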

Rees46

Metric

BPR-Max

Metrics
Implementation	Variant	Recall@1	MRR@1	Recall@5	MRR@5	Recall@10	MRR@10	Recall@20	MRR@20
GRU4Rec Official	Best params	0.1027	0.1027	0.2897	0.1680	0.4027	0.1831	0.5206	0.1913
GRU4Rec Official	GRU4REC-pytorch params	0.0280	0.0280	0.1061	0.0542	0.1732	0.0631	0.2615	0.0691
GRU4REC-pytorch	OOB	0.0050	0.0050	0.0326	0.0136	0.0685	0.0183	0.1316	0.0225
GRU4REC-pytorch	OOB Correct Eval	0.0048	0.0048	0.0324	0.0134	0.0713	0.0185	0.1418	0.0232
GRU4REC-pytorch	Correct full	0.0109	0.0109	0.0522	0.0242	0.0992	0.0303	0.1764	0.0355

Cross-Entropy

Metrics
Implementation	Variant	Recall@1	MRR@1	Recall@5	MRR@5	Recall@10	MRR@10	Recall@20	MRR@20
GRU4Rec Official	Best params	0.1108	0.1108	0.3000	0.1772	0.4126	0.1922	0.5291	0.2003
GRU4Rec Official	GRU4REC-pytorch params	0.0340	0.0340	0.1283	0.0657	0.2066	0.0760	0.3059	0.0828
GRU4REC-pytorch	OOB	0.0578	0.0578	0.0879	0.0698	0.0955	0.0708	0.1028	0.0713
GRU4REC-pytorch	OOB Correct Eval	0.0605	0.0605	0.0945	0.0740	0.1022	0.0750	0.1093	0.0755
GRU4REC-pytorch	Correct full	0.0102	0.0102	0.0531	0.0238	0.1080	0.0309	0.1994	0.0371

Metric Diff

BPR-Max

Metric difference compared to the “Best params” version with the corresponding loss
Implementation	Variant	Recall@1 Diff	MRR@1 Diff	Recall@5 Diff	MRR@5 Diff	Recall@10 Diff	MRR@10 Diff	Recall@20 Diff	MRR@20 Diff
GRU4Rec Official	Best params
GRU4Rec Official	GRU4REC-pytorch params	-72.74%	-72.74%	-63.38%	-67.73%	-57.00%	-65.56%	-49.76%	-63.86%
GRU4REC-pytorch	OOB	-95.17%	-95.17%	-88.74%	-91.89%	-82.98%	-90.01%	-74.72%	-88.21%
GRU4REC-pytorch	OOB Correct Eval	-95.35%	-95.35%	-88.81%	-92.01%	-82.30%	-89.92%	-72.76%	-87.86%
GRU4REC-pytorch	Correct full	-89.38%	-89.38%	-81.97%	-85.61%	-75.37%	-83.46%	-66.11%	-81.43%
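The percentages in the table above appear to be relative changes with respect to the "Best params" row. A short sketch under that assumption, using Recall@1 values from the Rees46 BPR-max metrics table (the helper name `relative_diff` is hypothetical):

```python
def relative_diff(value, reference):
    """Relative change of `value` with respect to `reference`, in percent."""
    return (value - reference) / reference * 100.0


# Recall@1 on Rees46 with BPR-max, copied from the metrics table.
best_params = 0.1027      # GRU4Rec Official, Best params
pytorch_params = 0.0280   # GRU4Rec Official, GRU4REC-pytorch params

print(f"{relative_diff(pytorch_params, best_params):.2f}%")
```

Recomputing from the rounded table values can differ slightly from the listed percentages. For the OOB row this gives about -95.13% against the listed -95.17%, presumably because the listed differences were derived from unrounded metrics.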

Cross-Entropy

Metric difference compared to the “Best params” version with the corresponding loss
Implementation	Variant	Recall@1 Diff	MRR@1 Diff	Recall@5 Diff	MRR@5 Diff	Recall@10 Diff	MRR@10 Diff	Recall@20 Diff	MRR@20 Diff
GRU4Rec Official	Best params
GRU4Rec Official	GRU4REC-pytorch params	-69.35%	-69.35%	-57.23%	-62.92%	-49.92%	-60.45%	-42.18%	-58.64%
GRU4REC-pytorch	OOB	-47.82%	-47.82%	-70.70%	-60.62%	-76.85%	-63.16%	-80.57%	-64.40%
GRU4REC-pytorch	OOB Correct Eval	-45.38%	-45.38%	-68.49%	-58.26%	-75.23%	-60.98%	-79.34%	-62.31%
GRU4REC-pytorch	Correct full	-90.78%	-90.78%	-82.31%	-86.58%	-73.82%	-83.93%	-62.31%	-81.47%

Hyperparameters

BPR-Max

Hyperparameters used in the experiment
	GRU4Rec Official	GRU4Rec Official	GRU4REC-pytorch	GRU4REC-pytorch	GRU4REC-pytorch
Variant	Best params	GRU4REC-pytorch params	OOB	OOB Correct Eval	Correct full
loss	bpr-max	bpr-max	bpr-max	bpr-max	bpr-max
optim	adagrad	adagrad	adagrad	adagrad	adagrad
constrained_embedding	True	False	False	False	False
embedding	0	512	512	512	512
final_act	elu-0.5	elu-0.5	elu-0.5	elu-0.5	elu-0.5
layers	512	512	512	512	512
batch_size	32	32	32	32	32
dropout_p_embed	0.1	0.1	N/A	N/A	0.1
dropout_p_hidden	0	0	N/A	N/A	0
learning_rate	0.03	0.03	0.03	0.03	0.03
momentum	0.55	0	N/A	N/A	N/A
n_sample	2048	0	N/A	N/A	N/A
sample_alpha	0.2	0	N/A	N/A	N/A
bpreg	0.75	0	N/A	N/A	N/A
logq	0	0	N/A	N/A	N/A

Cross-Entropy

Hyperparameters used in the experiment
	GRU4Rec Official	GRU4Rec Official	GRU4REC-pytorch	GRU4REC-pytorch	GRU4REC-pytorch
Variant	Best params	GRU4REC-pytorch params	OOB	OOB Correct Eval	Correct full
loss	cross-entropy	cross-entropy	cross-entropy	cross-entropy	cross-entropy
optim	adagrad	adagrad	adagrad	adagrad	adagrad
constrained_embedding	True	False	False	False	False
embedding	0	512	512	512	512
final_act	softmax	softmax	softmax	softmax	softmax
layers	512	512	512	512	512
batch_size	240	240	240	240	240
dropout_p_embed	0.45	0.45	N/A	N/A	0.45
dropout_p_hidden	0	0	N/A	N/A	0
learning_rate	0.065	0.065	0.065	0.065	0.065
momentum	0	0	N/A	N/A	N/A
n_sample	2048	0	N/A	N/A	N/A
sample_alpha	0.5	0	N/A	N/A	N/A
bpreg	0	0	N/A	N/A	N/A
logq	1	0	N/A	N/A	N/A

Runtimes

BPR-Max

Runtime metrics
Implementation	Variant	Avg. epoch time (s)	Avg. epoch time to Best	Avg. epoch time to Matching	Avg. mb/s	Avg. e/s
GRU4Rec Official	Best params	1956.80			916.45	29326.00
GRU4Rec Official	GRU4REC-pytorch params	1492.53	0.76 x		1201.51	38448.00
GRU4REC-pytorch	OOB	29528.42	15.09 x	19.78 x	60.73	1943.38
GRU4REC-pytorch	OOB Correct Eval	29553.78	15.10 x	19.80 x	60.68	1941.71
GRU4REC-pytorch	Correct full	30497.64	15.59 x	20.43 x	58.80	1881.62
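The two ratio columns appear to be the average epoch time of a row divided by that of the official implementation, once with the "Best params" row and once with the "GRU4REC-pytorch params" row as the reference. Likewise, "Avg. e/s" appears to be "Avg. mb/s" multiplied by the batch size. The sketch below checks both readings against the table above. The dictionary keys are hypothetical labels, and the interpretation is an assumption inferred from the numbers.

```python
# Values copied from the Rees46 BPR-max runtime table.
epoch_time = {
    "official_best": 1956.80,      # GRU4Rec Official, Best params
    "official_matching": 1492.53,  # GRU4Rec Official, GRU4REC-pytorch params
    "pytorch_oob": 29528.42,       # GRU4REC-pytorch, OOB
}

# Slowdown of the OOB variant relative to each official reference run.
to_best = epoch_time["pytorch_oob"] / epoch_time["official_best"]
to_matching = epoch_time["pytorch_oob"] / epoch_time["official_matching"]
print(f"{to_best:.2f} x, {to_matching:.2f} x")

# Mini-batches per second times batch size, for the "Best params" row.
batch_size = 32
events_per_second = 916.45 * batch_size
print(events_per_second)
```

Both ratios match the OOB row, and the product is within rounding distance of the listed 29326.00 events per second.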

Cross-Entropy

Runtime metrics
Implementation	Variant	Avg. epoch time (s)	Avg. epoch time to Best	Avg. epoch time to Matching	Avg. mb/s	Avg. e/s
GRU4Rec Official	Best params	367.41			650.91	156189.00
GRU4Rec Official	GRU4REC-pytorch params	283.97	0.77 x		842.10	202079.00
GRU4REC-pytorch	OOB	7618.05	20.73 x	26.83 x	31.39	7532.51
GRU4REC-pytorch	OOB Correct Eval	7615.82	20.73 x	26.82 x	31.39	7534.72
GRU4REC-pytorch	Correct full	7118.13	19.37 x	25.07 x	33.59	8061.53

Yoochoose

Metric

BPR-Max

Metrics
Implementation	Variant	Recall@1	MRR@1	Recall@5	MRR@5	Recall@10	MRR@10	Recall@20	MRR@20
GRU4Rec Official	Best params	0.1745	0.1745	0.4346	0.2675	0.5664	0.2851	0.6799	0.2931
GRU4Rec Official	GRU4REC-pytorch params	0.0988	0.0988	0.2613	0.1554	0.3620	0.1688	0.4655	0.1760
GRU4REC-pytorch	OOB	0.0002	0.0002	0.0012	0.0005	0.0031	0.0007	0.0087	0.0011
GRU4REC-pytorch	OOB Correct Eval	0.0009	0.0009	0.0104	0.0036	0.0409	0.0074	0.1066	0.0118
GRU4REC-pytorch	Correct full	0.0108	0.0108	0.0603	0.0269	0.1112	0.0335	0.1923	0.0391

Cross-Entropy

Metrics
Implementation	Variant	Recall@1	MRR@1	Recall@5	MRR@5	Recall@10	MRR@10	Recall@20	MRR@20
GRU4Rec Official	Best params	0.1797	0.1797	0.4457	0.2757	0.5698	0.2924	0.6804	0.3002
GRU4Rec Official	GRU4REC-pytorch params	0.0717	0.0717	0.2386	0.1301	0.3478	0.1446	0.4583	0.1523
GRU4REC-pytorch	OOB	0.0933	0.0933	0.1090	0.0998	0.1129	0.1003	0.1169	0.1006
GRU4REC-pytorch	OOB Correct Eval	0.0951	0.0951	0.1134	0.1029	0.1173	0.1034	0.1212	0.1037
GRU4REC-pytorch	Correct full	0.0493	0.0493	0.1997	0.1001	0.3052	0.1141	0.4271	0.1227

Metric Diff

BPR-Max

Metric difference compared to the “Best params” version with the corresponding loss
Implementation	Variant	Recall@1 Diff	MRR@1 Diff	Recall@5 Diff	MRR@5 Diff	Recall@10 Diff	MRR@10 Diff	Recall@20 Diff	MRR@20 Diff
GRU4Rec Official	Best params
GRU4Rec Official	GRU4REC-pytorch params	-43.42%	-43.42%	-39.88%	-41.92%	-36.09%	-40.80%	-31.53%	-39.96%
GRU4REC-pytorch	OOB	-99.91%	-99.91%	-99.72%	-99.83%	-99.46%	-99.76%	-98.72%	-99.63%
GRU4REC-pytorch	OOB Correct Eval	-99.48%	-99.48%	-97.62%	-98.66%	-92.78%	-97.40%	-84.32%	-95.98%
GRU4REC-pytorch	Correct full	-93.80%	-93.80%	-86.13%	-89.95%	-80.36%	-88.25%	-71.72%	-86.67%

Cross-Entropy

Metric difference compared to the “Best params” version with the corresponding loss
Implementation	Variant	Recall@1 Diff	MRR@1 Diff	Recall@5 Diff	MRR@5 Diff	Recall@10 Diff	MRR@10 Diff	Recall@20 Diff	MRR@20 Diff
GRU4Rec Official	Best params
GRU4Rec Official	GRU4REC-pytorch params	-60.10%	-60.10%	-46.48%	-52.83%	-38.96%	-50.54%	-32.64%	-49.27%
GRU4REC-pytorch	OOB	-48.10%	-48.10%	-75.54%	-63.82%	-80.19%	-65.70%	-82.82%	-66.50%
GRU4REC-pytorch	OOB Correct Eval	-47.07%	-47.07%	-74.56%	-62.69%	-79.42%	-64.64%	-82.19%	-65.46%
GRU4REC-pytorch	Correct full	-72.58%	-72.58%	-55.19%	-63.68%	-46.44%	-60.98%	-37.22%	-59.14%

Hyperparameters

BPR-Max

Hyperparameters used in the experiment
	GRU4Rec Official	GRU4Rec Official	GRU4REC-pytorch	GRU4REC-pytorch	GRU4REC-pytorch
Variant	Best params	GRU4REC-pytorch params	OOB	OOB Correct Eval	Correct full
loss	bpr-max	bpr-max	bpr-max	bpr-max	bpr-max
optim	adagrad	adagrad	adagrad	adagrad	adagrad
constrained_embedding	True	False	False	False	False
embedding	0	448	448	448	448
final_act	linear	linear	linear	linear	linear
layers	448	448	448	448	448
batch_size	48	48	48	48	48
dropout_p_embed	0.25	0.25	N/A	N/A	0.25
dropout_p_hidden	0	0	N/A	N/A	0
learning_rate	0.075	0.075	0.075	0.075	0.075
momentum	0.1	0	N/A	N/A	N/A
n_sample	2048	0	N/A	N/A	N/A
sample_alpha	0.2	0	N/A	N/A	N/A
bpreg	0.5	0	N/A	N/A	N/A
logq	0	0	N/A	N/A	N/A

Cross-Entropy

Hyperparameters used in the experiment
	GRU4Rec Official	GRU4Rec Official	GRU4REC-pytorch	GRU4REC-pytorch	GRU4REC-pytorch
Variant	Best params	GRU4REC-pytorch params	OOB	OOB Correct Eval	Correct full
loss	cross-entropy	cross-entropy	cross-entropy	cross-entropy	cross-entropy
optim	adagrad	adagrad	adagrad	adagrad	adagrad
constrained_embedding	True	False	False	False	False
embedding	0	480	480	480	480
final_act	softmax	softmax	softmax	softmax	softmax
layers	480	480	480	480	480
batch_size	48	48	48	48	48
dropout_p_embed	0	0	N/A	N/A	0
dropout_p_hidden	0.2	0.2	N/A	N/A	0.2
learning_rate	0.07	0.07	0.07	0.07	0.07
momentum	0	0	N/A	N/A	N/A
n_sample	2048	0	N/A	N/A	N/A
sample_alpha	0.2	0	N/A	N/A	N/A
bpreg	0	0	N/A	N/A	N/A
logq	1	0	N/A	N/A	N/A

Runtimes

BPR-Max

Runtime metrics
Implementation	Variant	Avg. epoch time (s)	Avg. epoch time to Best	Avg. epoch time to Matching	Avg. mb/s	Avg. e/s
GRU4Rec Official	Best params	487.51			919.23	44121.00
GRU4Rec Official	GRU4REC-pytorch params	362.75	0.74 x		1235.40	59297.00
GRU4REC-pytorch	OOB	1854.78	3.80 x	5.11 x	241.61	11597.27
GRU4REC-pytorch	OOB Correct Eval	1857.10	3.81 x	5.12 x	241.30	11582.50
GRU4REC-pytorch	Correct full	2123.40	4.36 x	5.85 x	211.06	10130.73

Cross-Entropy

Runtime metrics
Implementation	Variant	Avg. epoch time (s)	Avg. epoch time to Best	Avg. epoch time to Matching	Avg. mb/s	Avg. e/s
GRU4Rec Official	Best params	451.75			991.99	47613.00
GRU4Rec Official	GRU4REC-pytorch params	350.38	0.78 x		1279.01	61390.00
GRU4REC-pytorch	OOB	1948.18	4.31 x	5.56 x	230.02	11040.99
GRU4REC-pytorch	OOB Correct Eval	1944.45	4.30 x	5.55 x	230.46	11062.17
GRU4REC-pytorch	Correct full	1933.25	4.28 x	5.52 x	231.80	11126.37

Coveo

Metric

BPR-Max

Metrics
Implementation	Variant	Recall@1	MRR@1	Recall@5	MRR@5	Recall@10	MRR@10	Recall@20	MRR@20
GRU4Rec Official	Best params	0.0501	0.0501	0.1464	0.0835	0.2172	0.0928	0.3123	0.0994
GRU4Rec Official	GRU4REC-pytorch params	0.0297	0.0297	0.0965	0.0525	0.1485	0.0593	0.2216	0.0643
GRU4REC-pytorch	OOB	0.0110	0.0110	0.0421	0.0212	0.0699	0.0248	0.1141	0.0279
GRU4REC-pytorch	OOB Correct Eval	0.0140	0.0140	0.0567	0.0281	0.0934	0.0329	0.1510	0.0368
GRU4REC-pytorch	Correct full	0.0165	0.0165	0.0626	0.0320	0.1014	0.0371	0.1582	0.0410

Cross-Entropy

Metrics
Implementation	Variant	Recall@1	MRR@1	Recall@5	MRR@5	Recall@10	MRR@10	Recall@20	MRR@20
GRU4Rec Official	Best params	0.0489	0.0489	0.1418	0.0814	0.2085	0.0901	0.2947	0.0960
GRU4Rec Official	GRU4REC-pytorch params	0.0275	0.0275	0.0871	0.0478	0.1362	0.0543	0.2061	0.0591
GRU4REC-pytorch	OOB	0.0312	0.0312	0.0468	0.0370	0.0537	0.0379	0.0628	0.0386
GRU4REC-pytorch	OOB Correct Eval	0.0315	0.0315	0.0487	0.0380	0.0566	0.0390	0.0663	0.0397
GRU4REC-pytorch	Correct full	0.0222	0.0222	0.0742	0.0397	0.1175	0.0454	0.1806	0.0497

Metric Diff

BPR-Max

Metric difference compared to the “Best params” version with the corresponding loss
Implementation	Variant	Recall@1 Diff	MRR@1 Diff	Recall@5 Diff	MRR@5 Diff	Recall@10 Diff	MRR@10 Diff	Recall@20 Diff	MRR@20 Diff
GRU4Rec Official	Best params
GRU4Rec Official	GRU4REC-pytorch params	-40.78%	-40.78%	-34.09%	-37.15%	-31.64%	-36.10%	-29.03%	-35.25%
GRU4REC-pytorch	OOB	-78.15%	-78.15%	-71.20%	-74.56%	-67.80%	-73.24%	-63.46%	-71.96%
GRU4REC-pytorch	OOB Correct Eval	-72.16%	-72.16%	-61.23%	-66.35%	-57.01%	-64.60%	-51.65%	-63.00%
GRU4REC-pytorch	Correct full	-67.09%	-67.09%	-57.23%	-61.67%	-53.33%	-60.04%	-49.33%	-58.77%

Cross-Entropy

Metric difference compared to the “Best params” version with the corresponding loss
Implementation	Variant	Recall@1 Diff	MRR@1 Diff	Recall@5 Diff	MRR@5 Diff	Recall@10 Diff	MRR@10 Diff	Recall@20 Diff	MRR@20 Diff
GRU4Rec Official	Best params
GRU4Rec Official	GRU4REC-pytorch params	-43.74%	-43.74%	-38.54%	-41.18%	-34.70%	-39.78%	-30.08%	-38.46%
GRU4REC-pytorch	OOB	-36.22%	-36.22%	-66.97%	-54.48%	-74.23%	-57.91%	-78.70%	-59.84%
GRU4REC-pytorch	OOB Correct Eval	-35.53%	-35.53%	-65.62%	-53.30%	-72.86%	-56.69%	-77.50%	-58.65%
GRU4REC-pytorch	Correct full	-54.53%	-54.53%	-47.65%	-51.22%	-43.66%	-49.61%	-38.73%	-48.21%

Hyperparameters

BPR-Max

Hyperparameters used in the experiment
	GRU4Rec Official	GRU4Rec Official	GRU4REC-pytorch	GRU4REC-pytorch	GRU4REC-pytorch
Variant	Best params	GRU4REC-pytorch params	OOB	OOB Correct Eval	Correct full
loss	bpr-max	bpr-max	bpr-max	bpr-max	bpr-max
optim	adagrad	adagrad	adagrad	adagrad	adagrad
constrained_embedding	True	False	False	False	False
embedding	0	512	512	512	512
final_act	elu-1	elu-1	elu-1	elu-1	elu-1
layers	512	512	512	512	512
batch_size	144	144	144	144	144
dropout_p_embed	0.35	0.35	N/A	N/A	0.35
dropout_p_hidden	0	0	N/A	N/A	0
learning_rate	0.05	0.05	0.05	0.05	0.05
momentum	0.4	0	N/A	N/A	N/A
n_sample	2048	0	N/A	N/A	N/A
sample_alpha	0.2	0	N/A	N/A	N/A
bpreg	1.85	0	N/A	N/A	N/A
logq	0	0	N/A	N/A	N/A

Cross-Entropy

Hyperparameters used in the experiment
	GRU4Rec Official	GRU4Rec Official	GRU4REC-pytorch	GRU4REC-pytorch	GRU4REC-pytorch
Variant	Best params	GRU4REC-pytorch params	OOB	OOB Correct Eval	Correct full
loss	cross-entropy	cross-entropy	cross-entropy	cross-entropy	cross-entropy
optim	adagrad	adagrad	adagrad	adagrad	adagrad
constrained_embedding	True	False	False	False	False
embedding	0	512	512	512	512
final_act	softmax	softmax	softmax	softmax	softmax
layers	512	512	512	512	512
batch_size	32	32	32	32	32
dropout_p_embed	0.4	0.4	N/A	N/A	0.4
dropout_p_hidden	0.15	0.15	N/A	N/A	0.15
learning_rate	0.03	0.03	0.03	0.03	0.03
momentum	0	0	N/A	N/A	N/A
n_sample	2048	0	N/A	N/A	N/A
sample_alpha	0	0	N/A	N/A	N/A
bpreg	0	0	N/A	N/A	N/A
logq	1	0	N/A	N/A	N/A

Runtimes

BPR-Max

Runtime metrics
Implementation	Variant	Avg. epoch time (s)	Avg. epoch time to Best	Avg. epoch time to Matching	Avg. mb/s	Avg. e/s
GRU4Rec Official	Best params	12.38			704.14	100615.00
GRU4Rec Official	GRU4REC-pytorch params	9.25	0.75 x		941.96	134690.00
GRU4REC-pytorch	OOB	23.35	1.89 x	2.52 x	370.06	53288.62
GRU4REC-pytorch	OOB Correct Eval	23.19	1.87 x	2.51 x	372.64	53660.93
GRU4REC-pytorch	Correct full	31.91	2.58 x	3.45 x	271.16	39046.98

Cross-Entropy

Runtime metrics
Implementation	Variant	Avg. epoch time (s)	Avg. epoch time to Best	Avg. epoch time to Matching	Avg. mb/s	Avg. e/s
GRU4Rec Official	Best params	37.17			1047.47	33505.00
GRU4Rec Official	GRU4REC-pytorch params	30.51	0.82 x		1276.12	40819.00
GRU4REC-pytorch	OOB	85.06	2.29 x	2.79 x	457.51	14640.34
GRU4REC-pytorch	OOB Correct Eval	86.98	2.34 x	2.85 x	447.82	14330.19
GRU4REC-pytorch	Correct full	89.90	2.42 x	2.95 x	432.89	13852.60

Retailrocket

Metric

BPR-Max

Metrics
Implementation	Variant	Recall@1	MRR@1	Recall@5	MRR@5	Recall@10	MRR@10	Recall@20	MRR@20
GRU4Rec Official	Best params	0.1224	0.1224	0.3196	0.1928	0.4181	0.2060	0.5187	0.2131
GRU4Rec Official	GRU4REC-pytorch params	0.0560	0.0560	0.1595	0.0916	0.2294	0.1009	0.3142	0.1067
GRU4REC-pytorch	OOB	0.0096	0.0096	0.0355	0.0184	0.0544	0.0210	0.0794	0.0227
GRU4REC-pytorch	OOB Correct Eval	0.0406	0.0406	0.1341	0.0728	0.1933	0.0807	0.2528	0.0848
GRU4REC-pytorch	Correct full	0.0456	0.0456	0.1443	0.0804	0.1976	0.0875	0.2554	0.0916

Cross-Entropy

Metrics
Implementation	Variant	Recall@1	MRR@1	Recall@5	MRR@5	Recall@10	MRR@10	Recall@20	MRR@20
GRU4Rec Official	Best params	0.1150	0.1150	0.3026	0.1814	0.4019	0.1947	0.4953	0.2012
GRU4Rec Official	GRU4REC-pytorch params	0.0578	0.0578	0.1681	0.0955	0.2441	0.1056	0.3378	0.1121
GRU4REC-pytorch	OOB	0.0479	0.0479	0.0561	0.0511	0.0590	0.0515	0.0616	0.0517
GRU4REC-pytorch	OOB Correct Eval	0.0510	0.0510	0.0582	0.0539	0.0607	0.0543	0.0642	0.0545
GRU4REC-pytorch	Correct full	0.0492	0.0492	0.1544	0.0858	0.2163	0.0941	0.2812	0.0986

Metric Diff

BPR-Max

Metric difference compared to the “Best params” version with the corresponding loss
Implementation	Variant	Recall@1 Diff	MRR@1 Diff	Recall@5 Diff	MRR@5 Diff	Recall@10 Diff	MRR@10 Diff	Recall@20 Diff	MRR@20 Diff
GRU4Rec Official	Best params
GRU4Rec Official	GRU4REC-pytorch params	-54.23%	-54.23%	-50.10%	-52.48%	-45.13%	-51.04%	-39.41%	-49.92%
GRU4REC-pytorch	OOB	-92.12%	-92.12%	-88.90%	-90.45%	-86.99%	-89.83%	-84.69%	-89.36%
GRU4REC-pytorch	OOB Correct Eval	-66.88%	-66.88%	-58.06%	-62.23%	-53.78%	-60.84%	-51.27%	-60.18%
GRU4REC-pytorch	Correct full	-62.76%	-62.76%	-54.84%	-58.30%	-52.74%	-57.52%	-50.76%	-57.01%

Cross-Entropy

Metric difference compared to the “Best params” version with the corresponding loss
Implementation	Variant	Recall@1 Diff	MRR@1 Diff	Recall@5 Diff	MRR@5 Diff	Recall@10 Diff	MRR@10 Diff	Recall@20 Diff	MRR@20 Diff
GRU4Rec Official	Best params
GRU4Rec Official	GRU4REC-pytorch params	-49.71%	-49.71%	-44.43%	-47.36%	-39.25%	-45.78%	-31.79%	-44.32%
GRU4REC-pytorch	OOB	-58.36%	-58.36%	-81.47%	-71.82%	-85.31%	-73.53%	-87.55%	-74.30%
GRU4REC-pytorch	OOB Correct Eval	-55.66%	-55.66%	-80.76%	-70.27%	-84.90%	-72.13%	-87.05%	-72.91%
GRU4REC-pytorch	Correct full	-57.19%	-57.19%	-48.99%	-52.73%	-46.19%	-51.69%	-43.23%	-51.02%

Hyperparameters

BPR-Max

Hyperparameters used in the experiment
	GRU4Rec Official	GRU4Rec Official	GRU4REC-pytorch	GRU4REC-pytorch	GRU4REC-pytorch
Variant	Best params	GRU4REC-pytorch params	OOB	OOB Correct Eval	Correct full
loss	bpr-max	bpr-max	bpr-max	bpr-max	bpr-max
optim	adagrad	adagrad	adagrad	adagrad	adagrad
constrained_embedding	True	False	False	False	False
embedding	0	224	224	224	224
final_act	elu-0.5	elu-0.5	elu-0.5	elu-0.5	elu-0.5
layers	224	224	224	224	224
batch_size	80	80	80	80	80
dropout_p_embed	0.5	0.5	N/A	N/A	0.5
dropout_p_hidden	0.05	0.05	N/A	N/A	0.05
learning_rate	0.05	0.05	0.05	0.05	0.05
momentum	0.4	0	N/A	N/A	N/A
n_sample	2048	0	N/A	N/A	N/A
sample_alpha	0.4	0	N/A	N/A	N/A
bpreg	1.95	0	N/A	N/A	N/A
logq	0	0	N/A	N/A	N/A

Cross-Entropy

Hyperparameters used in the experiment
	GRU4Rec Official	GRU4Rec Official	GRU4REC-pytorch	GRU4REC-pytorch	GRU4REC-pytorch
Variant	Best params	GRU4REC-pytorch params	OOB	OOB Correct Eval	Correct full
loss	cross-entropy	cross-entropy	cross-entropy	cross-entropy	cross-entropy
optim	adagrad	adagrad	adagrad	adagrad	adagrad
constrained_embedding	True	False	False	False	False
embedding	0	192	192	192	192
final_act	softmax	softmax	softmax	softmax	softmax
layers	192	192	192	192	192
batch_size	240	240	240	240	240
dropout_p_embed	0.5	0.5	N/A	N/A	0.5
dropout_p_hidden	0.05	0.05	N/A	N/A	0.05
learning_rate	0.085	0.085	0.085	0.085	0.085
momentum	0.3	0	N/A	N/A	N/A
n_sample	2048	0	N/A	N/A	N/A
sample_alpha	0.3	0	N/A	N/A	N/A
bpreg	0	0	N/A	N/A	N/A
logq	1	0	N/A	N/A	N/A

Runtimes

BPR-Max

Runtime metrics
Implementation	Variant	Avg. epoch time (s)	Avg. epoch time to Best	Avg. epoch time to Matching	Avg. mb/s	Avg. e/s
GRU4Rec Official	Best params	6.86			1019.34	80807.00
GRU4Rec Official	GRU4REC-pytorch params	5.22	0.76 x		1338.98	106326.00
GRU4REC-pytorch	OOB	20.53	2.99 x	3.93 x	337.68	27014.01
GRU4REC-pytorch	OOB Correct Eval	20.44	2.98 x	3.92 x	339.01	27120.93
GRU4REC-pytorch	Correct full	24.42	3.56 x	4.68 x	283.82	22705.56

Cross-Entropy

Runtime metrics
Implementation	Variant	Avg. epoch time (s)	Avg. epoch time to Best	Avg. epoch time to Matching	Avg. mb/s	Avg. e/s
GRU4Rec Official	Best params	2.77			880.71	199935.00
GRU4Rec Official	GRU4REC-pytorch params	1.89	0.68 x		1279.60	293600.00
GRU4REC-pytorch	OOB	10.10	3.65 x	5.34 x	228.56	54855.21
GRU4REC-pytorch	OOB Correct Eval	10.03	3.62 x	5.31 x	230.11	55225.62
GRU4REC-pytorch	Correct full	9.14	3.30 x	4.84 x	252.47	60592.49

Diginetica

Metric

BPR-Max

Metrics
Implementation	Variant	Recall@1	MRR@1	Recall@5	MRR@5	Recall@10	MRR@10	Recall@20	MRR@20
GRU4Rec Official	Best params	0.0688	0.0688	0.2304	0.1237	0.3533	0.1399	0.4995	0.1500
GRU4Rec Official	GRU4REC-pytorch params	0.0365	0.0365	0.1283	0.0675	0.2066	0.0778	0.3141	0.0851
GRU4REC-pytorch	OOB	0.0006	0.0006	0.0020	0.0011	0.0039	0.0013	0.0070	0.0015
GRU4REC-pytorch	OOB Correct Eval	0.0239	0.0239	0.0937	0.0472	0.1537	0.0551	0.2367	0.0608
GRU4REC-pytorch	Correct full	0.0277	0.0277	0.1070	0.0543	0.1747	0.0632	0.2616	0.0692

Cross-Entropy

Metrics
Implementation	Variant	Recall@1	MRR@1	Recall@5	MRR@5	Recall@10	MRR@10	Recall@20	MRR@20
GRU4Rec Official	Best params	0.0647	0.0647	0.2220	0.1181	0.3414	0.1339	0.4874	0.1440
GRU4Rec Official	GRU4REC-pytorch params	0.0296	0.0296	0.1133	0.0576	0.1888	0.0675	0.2973	0.0749
GRU4REC-pytorch	OOB	0.0287	0.0287	0.0376	0.0321	0.0415	0.0327	0.0457	0.0329
GRU4REC-pytorch	OOB Correct Eval	0.0321	0.0321	0.0415	0.0357	0.0457	0.0363	0.0503	0.0366
GRU4REC-pytorch	Correct full	0.0288	0.0288	0.1135	0.0572	0.1860	0.0667	0.2862	0.0736

Metric Diff

BPR-Max

Metric difference compared to the “Best params” version with the corresponding loss
Implementation	Variant	Recall@1 Diff	MRR@1 Diff	Recall@5 Diff	MRR@5 Diff	Recall@10 Diff	MRR@10 Diff	Recall@20 Diff	MRR@20 Diff
GRU4Rec Official	Best params
GRU4Rec Official	GRU4REC-pytorch params	-47.02%	-47.02%	-44.31%	-45.41%	-41.53%	-44.39%	-37.13%	-43.22%
GRU4REC-pytorch	OOB	-99.15%	-99.15%	-99.11%	-99.14%	-98.90%	-99.07%	-98.60%	-98.99%
GRU4REC-pytorch	OOB Correct Eval	-65.34%	-65.34%	-59.33%	-61.86%	-56.49%	-60.61%	-52.62%	-59.47%
GRU4REC-pytorch	Correct full	-59.74%	-59.74%	-53.55%	-56.08%	-50.55%	-54.81%	-47.62%	-53.87%

Cross-Entropy

Metric difference compared to the “Best params” version with the corresponding loss
Implementation	Variant	Recall@1 Diff	MRR@1 Diff	Recall@5 Diff	MRR@5 Diff	Recall@10 Diff	MRR@10 Diff	Recall@20 Diff	MRR@20 Diff
GRU4Rec Official	Best params
GRU4Rec Official	GRU4REC-pytorch params	-54.28%	-54.28%	-48.96%	-51.20%	-44.68%	-49.54%	-39.00%	-47.98%
GRU4REC-pytorch	OOB	-55.59%	-55.59%	-83.08%	-72.79%	-87.86%	-75.61%	-90.62%	-77.13%
GRU4REC-pytorch	OOB Correct Eval	-50.44%	-50.44%	-81.29%	-69.74%	-86.62%	-72.89%	-89.68%	-74.58%
GRU4REC-pytorch	Correct full	-55.50%	-55.50%	-48.88%	-51.59%	-45.51%	-50.15%	-41.28%	-48.90%

Hyperparameters

BPR-Max

Hyperparameters used in the experiment
	GRU4Rec Official	GRU4Rec Official	GRU4REC-pytorch	GRU4REC-pytorch	GRU4REC-pytorch
Variant	Best params	GRU4REC-pytorch params	OOB	OOB Correct Eval	Correct full
loss	bpr-max	bpr-max	bpr-max	bpr-max	bpr-max
optim	adagrad	adagrad	adagrad	adagrad	adagrad
constrained_embedding	True	False	False	False	False
embedding	0	512	512	512	512
final_act	elu-1	elu-1	elu-1	elu-1	elu-1
layers	512	512	512	512	512
batch_size	128	128	128	128	128
dropout_p_embed	0.5	0.5	N/A	N/A	0.5
dropout_p_hidden	0.3	0.3	N/A	N/A	0.3
learning_rate	0.05	0.05	0.05	0.05	0.05
momentum	0.15	0	N/A	N/A	N/A
n_sample	2048	0	N/A	N/A	N/A
sample_alpha	0.3	0	N/A	N/A	N/A
bpreg	0.9	0	N/A	N/A	N/A
logq	0	0	N/A	N/A	N/A

Cross-Entropy

Hyperparameters used in the experiment
	GRU4Rec Official	GRU4Rec Official	GRU4REC-pytorch	GRU4REC-pytorch	GRU4REC-pytorch
Variant	Best params	GRU4REC-pytorch params	OOB	OOB Correct Eval	Correct full
loss	cross-entropy	cross-entropy	cross-entropy	cross-entropy	cross-entropy
optim	adagrad	adagrad	adagrad	adagrad	adagrad
constrained_embedding	True	False	False	False	False
embedding	0	192	192	192	192
final_act	softmax	softmax	softmax	softmax	softmax
layers	192	192	192	192	192
batch_size	128	128	128	128	128
dropout_p_embed	0.45	0.45	N/A	N/A	0.45
dropout_p_hidden	0.15	0.15	N/A	N/A	0.15
learning_rate	0.1	0.1	0.1	0.1	0.1
momentum	0	0	N/A	N/A	N/A
n_sample	2048	0	N/A	N/A	N/A
sample_alpha	0	0	N/A	N/A	N/A
bpreg	0	0	N/A	N/A	N/A
logq	1	0	N/A	N/A	N/A

Runtimes

BPR-Max

Runtime metrics
Implementation	Variant	Avg. epoch time (s)	Avg. epoch time to Best	Avg. epoch time to Matching	Avg. mb/s	Avg. e/s
GRU4Rec Official	Best params	8.02			639.87	81757.00
GRU4Rec Official	GRU4REC-pytorch params	5.29	0.66 x		969.27	123869.00
GRU4REC-pytorch	OOB	32.24	4.02 x	6.09 x	158.86	20333.91
GRU4REC-pytorch	OOB Correct Eval	32.18	4.01 x	6.08 x	159.20	20378.12
GRU4REC-pytorch	Correct full	36.76	4.58 x	6.95 x	139.36	17838.75

Cross-Entropy

Runtime metrics
Implementation	Variant	Avg. epoch time (s)	Avg. epoch time to Best	Avg. epoch time to Matching	Avg. mb/s	Avg. e/s
GRU4Rec Official	Best params	4.52			1134.52	144959.00
GRU4Rec Official	GRU4REC-pytorch params	3.67	0.81 x		1398.76	178755.00
GRU4REC-pytorch	OOB	17.65	3.90 x	4.81 x	290.27	37154.98
GRU4REC-pytorch	OOB Correct Eval	17.67	3.91 x	4.81 x	289.91	37108.16
GRU4REC-pytorch	Correct full	16.97	3.75 x	4.62 x	301.78	38627.84
