Expert systems are often used to improve assembly quality during production. These systems typically rely on a system model to identify improvements. However, because any such model uses approximate dynamics or imperfect parameters, the expert advice is bound to be biased. This paper presents a reinforcement learning agent that can identify and limit the systematic errors of an expert system used for geometry assurance. By observing the resulting assembly quality over time and learning how different decisions affect it, the agent learns when and how to override the biased advice from the expert software.