Zero-shot Fact Verification by Claim Generation


Neural models for automated fact verification have achieved promising results thanks to the availability of large, human-annotated datasets. However, for each new domain that requires fact verification, creating a dataset by manually writing claims and linking them to their supporting evidence is expensive. We develop QACG, a framework for training a robust fact verification model using automatically generated claims that can be supported, refuted, or unverifiable given evidence from Wikipedia. QACG generates question-answer pairs from the evidence and then converts them into different types of claims. Experiments on the FEVER dataset show that the QACG framework significantly reduces the demand for human-annotated training data. In a zero-shot scenario, QACG improves a RoBERTa model’s F1 from 50% to 77%, performance equivalent to training on over 2K manually curated examples.
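
To make the idea of turning question-answer pairs into claims concrete, here is a minimal toy sketch in Python. It is not the QACG implementation: the names `QAPair`, `qa_to_claim`, and `generate_claims`, the wh-word template, and the use of a hand-supplied distractor answer to form a refuted claim are all assumptions made purely for illustration. Unverifiable claims are left out of the sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QAPair:
    """Hypothetical container for a question-answer pair drawn from evidence."""
    question: str
    answer: str


# Assumption: this toy template only handles questions that start with these words.
WH_WORDS = ("Who", "What")


def qa_to_claim(question: str, answer: str) -> str:
    """Turn a question into a declarative claim by replacing its wh-word with the answer."""
    body = question.strip().rstrip("?").strip()
    first, _, rest = body.partition(" ")
    if first not in WH_WORDS or not rest:
        raise ValueError(f"unsupported question form: {question!r}")
    return f"{answer} {rest}."


def generate_claims(qa: QAPair, distractor: str) -> List[Tuple[str, str]]:
    """Return (claim, label) pairs: one using the true answer, one using a wrong answer."""
    return [
        (qa_to_claim(qa.question, qa.answer), "supported"),
        # Assumption: substituting a wrong answer yields a claim the evidence refutes.
        (qa_to_claim(qa.question, distractor), "refuted"),
    ]


if __name__ == "__main__":
    pair = QAPair("Who wrote Hamlet?", "William Shakespeare")
    for claim, label in generate_claims(pair, distractor="Christopher Marlowe"):
        print(label, "|", claim)
```

A template like this only covers trivially structured questions; it is meant to show the shape of the data flow (evidence, then question-answer pair, then labeled claim), not a robust conversion method.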