I wonder what would happen if someone evolved a circuit on a large number of FPGAs from different batches. Each of the FPGAs would receive the same input in each iteration, but the output function would be biased to expose the worst-behaving units (maybe the bias should be raised in later iterations, when most units behave well).
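To make that concrete, here is a rough Python sketch of one way such a bias could work, reading "output function" as the fitness function of the genetic algorithm. Everything here is an assumption for illustration: the names `aggregate_fitness` and `bias_schedule`, the blend of mean and worst score, and the linear ramp from 0.2 to 0.9 are made up, not taken from any real experiment.

```python
def aggregate_fitness(per_device_scores, worst_bias):
    """Blend the mean score with the worst device's score.

    worst_bias = 0.0 gives the plain mean across devices;
    worst_bias = 1.0 scores the candidate by its worst device only.
    """
    if not per_device_scores:
        raise ValueError("need at least one device score")
    mean = sum(per_device_scores) / len(per_device_scores)
    worst = min(per_device_scores)
    return (1.0 - worst_bias) * mean + worst_bias * worst


def bias_schedule(generation, total_generations, start=0.2, end=0.9):
    """Linearly raise the worst-case bias over the run (hypothetical values)."""
    frac = generation / max(1, total_generations - 1)
    return start + (end - start) * min(1.0, frac)


# Same candidate circuit, scored on three (imaginary) FPGAs.
scores = [1.0, 0.5, 0.0]
early = aggregate_fitness(scores, bias_schedule(0, 100))
late = aggregate_fitness(scores, bias_schedule(99, 100))
```

With a scheme like this, a circuit that only works on some boards scores lower and lower as the run goes on, since `late` is pulled toward the worst board's score.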
Either it would generate a more robust (and likely more recognizable) solution, or it would fail to converge, really.
You may need to train on a smaller number of FPGAs and gradually increase the set. Genetic algorithms have been finicky to get right, and you might find that more devices would massively increase the iteration count.
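A minimal sketch of what "gradually increase the set" could look like, in Python. The function name `active_device_count`, the starting size of 2, and the doubling every 50 generations are all hypothetical choices, just to show the shape of the schedule.

```python
def active_device_count(generation, total_devices, start=2, grow_every=50):
    """Number of FPGAs to evaluate on at a given generation.

    Starts with `start` devices and doubles every `grow_every`
    generations, capped at the total number of devices available.
    """
    count = start * 2 ** (generation // grow_every)
    return min(total_devices, count)


# Devices in use at a few points of a run with 32 boards available.
schedule = [active_device_count(g, 32) for g in (0, 49, 50, 100, 1000)]
```

Each candidate has to be evaluated once per active device, so the cost per generation grows with the set, which is one reason to keep it small while the population is still mostly broken.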