Network slicing is one of the major catalysts for turning future telecommunication networks into versatile service platforms. Along with its benefits, network slicing introduces new challenges for the development of sustainable network operations. In fact, guaranteeing slice requirements comes at the cost of additional energy consumption compared to non-sliced networks. Yet one of the main goals of operators is to offer diverse 5G-and-beyond services while ensuring energy efficiency. To this end, we study the problem of slice activation/deactivation, with the objective of minimizing energy consumption and maximizing the users' quality of service (QoS). To solve the problem, we rely on two Multi-Armed Bandit (MAB) agents to derive decisions at individual base stations. Our evaluations are conducted using a real-world traffic dataset collected over an operational network in a medium-sized French city. Numerical results reveal that our proposed solutions provide approximately 11-14% energy efficiency improvement compared to a configuration where all the slice instances are active, while maintaining the same level of QoS. Moreover, our work explicitly shows the impact of prioritizing energy over QoS, and vice versa.