Best results for intermediate $N_{samples}$:
Upper bounds on intrinsic VOI $\Lambda^b_i$ of testing the $i$th arm $N$ times:
$$\Lambda_\alpha^b < \frac {N\overline X_\beta^{n_\beta}}{n_\alpha+1}\cdot 2\exp\left(- 1.37(\overline X_\alpha^{n_\alpha}\hspace{-0.5em} - \overline X_\beta^{n_\beta})^2 n_\alpha\right)$$

$$\Lambda_{i|i\ne\alpha}^b < \frac {N(1-\overline X_\alpha^{n_\alpha})} {n_i+1}\cdot 2\exp\left(- 1.37(\overline X_\alpha^{n_\alpha}\hspace{-0.5em} - \overline X_i^{n_i})^2 n_i\right)$$

MCTS re-uses rollouts generated at earlier search states.
The cost of a sample is the VOI of increasing a future budget by one sample.
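
The two upper bounds on intrinsic VOI stated earlier can be evaluated directly from the per-arm sample means and sample counts. The following is a minimal sketch, not a reference implementation. It assumes that $\alpha$ is the arm with the highest sample mean, that $\beta$ is the arm with the second-highest sample mean, and that rewards lie in $[0, 1]$; the function name `voi_upper_bounds` and its argument names are hypothetical.

```python
import math


def voi_upper_bounds(means, counts, N):
    """Upper bounds on the intrinsic VOI of sampling each arm N more times.

    means[i]  -- sample mean of arm i (rewards assumed to lie in [0, 1])
    counts[i] -- number of samples n_i drawn from arm i so far
    N         -- number of additional samples considered for one arm
    """
    # Assumption: alpha is the arm with the best sample mean,
    # beta the arm with the second-best sample mean.
    order = sorted(range(len(means)), key=lambda i: means[i], reverse=True)
    alpha, beta = order[0], order[1]

    bounds = []
    for i, (mean_i, n_i) in enumerate(zip(means, counts)):
        if i == alpha:
            # Bound for the currently best arm: compare against beta.
            gain = means[beta]
            gap = means[alpha] - means[beta]
        else:
            # Bound for any other arm: compare against alpha.
            gain = 1.0 - means[alpha]
            gap = means[alpha] - mean_i
        bound = N * gain / (n_i + 1) * 2.0 * math.exp(-1.37 * gap ** 2 * n_i)
        bounds.append(bound)
    return bounds


# Usage: three arms, ten samples each, one additional sample considered.
example = voi_upper_bounds([0.6, 0.5, 0.2], [10, 10, 10], N=1)
print(example)
```

In each bound, the exponential factor shrinks as the gap between the sample means widens or as $n_i$ grows, so arms that are already clearly separated from $\alpha$ receive a small estimate.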