In this paper, reinforcement learning is used to develop a new traffic shaper that achieves reasonable bandwidth utilization while preventing traffic overload in other parts of the network. This reduces the total number of packet drops across the whole network. The method is implemented in a newly proposed intelligent simulation environment. In simulation, the shaper behaves satisfactorily: it keeps the dropping probability low while injecting as many packets as possible into the network in order to utilize the available bandwidth. Moreover, the results show that the system can perform well even in situations to which it has not previously been exposed.