In a multi-user wireless network under jamming attacks, self-interested users trying to learn their best anti-jamming strategy face a major problem: mutual interference. An appealing solution is for users to learn anti-jamming techniques cooperatively through a distributed learning algorithm. In this context, existing works proposing distributed learning algorithms to overcome jamming in multi-user applications rely on the availability of a safe communication link for exchanging information between users. However, if this communication link is wireless, the assumption that it is safe against jamming attacks does not hold. Consequently, we propose a novel distributed multi-agent reinforcement learning algorithm for anti-jamming, namely Cross-Check Q-learning, in which users build estimates of each other’s decision-making policies and adapt to them, thereby eliminating the need to communicate. When applied against both sweeping and smart jammers, our algorithm provides users with a better understanding of their environment, helps them learn the attacker’s policy, and effectively avoids mutual interference. The proposed method’s transmission rates and interference levels are compared with those of standard Q-learning, a collaborative multi-agent algorithm, and a random policy. Simulation results show that our algorithm improves the users’ transmission rates, eliminates mutual interference, and converges fastest under the two considered jamming attacks.
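The core idea of adapting to an estimated peer policy can be illustrated with a minimal sketch, not the paper's actual algorithm: each agent runs tabular Q-learning against a sweeping jammer while keeping an empirical frequency estimate of the other user's channel choices and penalizing channels the peer is likely to occupy. The class name, state and reward definitions, and jammer model below are all assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

class CrossCheckAgent:
    """Toy agent: Q-learning over its own channel choice, plus an empirical
    estimate of the peer's channel policy (hypothetical 'cross-check' step)."""

    def __init__(self, n_channels, eps=0.1, alpha=0.1, gamma=0.9):
        self.n = n_channels
        self.eps, self.alpha, self.gamma = eps, alpha, gamma
        # Q[s, a]: state s = jammer's last observed channel, action a = own channel.
        self.Q = np.zeros((n_channels, n_channels))
        # Counts of the peer's observed channel choices (Laplace prior).
        self.peer_counts = np.ones(n_channels)

    def peer_policy(self):
        return self.peer_counts / self.peer_counts.sum()

    def act(self, state):
        if rng.random() < self.eps:
            return int(rng.integers(self.n))
        # Penalize channels the peer is likely to pick (interference avoidance).
        score = self.Q[state] - self.peer_policy()
        return int(np.argmax(score))

    def update(self, state, action, reward, next_state, peer_action):
        self.peer_counts[peer_action] += 1  # cross-check: track the peer
        td = reward + self.gamma * self.Q[next_state].max() - self.Q[state, action]
        self.Q[state, action] += self.alpha * td

# Two users versus a sweeping jammer on 4 channels.
N = 4
agents = [CrossCheckAgent(N) for _ in range(2)]
jam = 0
for t in range(5000):
    state = jam
    acts = [ag.act(state) for ag in agents]
    jam_next = (jam + 1) % N  # deterministic channel sweep
    for i, ag in enumerate(agents):
        peer = acts[1 - i]
        # Reward 1 only if the chosen channel avoids both jammer and peer.
        r = 1.0 if (acts[i] != jam_next and acts[i] != peer) else 0.0
        ag.update(state, acts[i], r, jam_next, peer)
    jam = jam_next
```

In this sketch the peer's action is observed directly after each slot; in the setting described above, such estimates would instead have to be inferred from spectrum sensing, since the users do not communicate.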