Dynamic-KGQA
A Dynamic Knowledge Graph Question Answering Framework
About Dynamic-KGQA
In this work, we introduce Dynamic-KGQA, a scalable framework for generating adaptive QA datasets from knowledge graphs (KGs), designed to mitigate memorization risks while maintaining statistical consistency across iterations. Unlike fixed benchmarks, Dynamic-KGQA generates a new dataset variant on every run while preserving the underlying distribution, enabling fair and reproducible evaluations.
Furthermore, our framework provides fine-grained control over dataset characteristics, supporting domain-specific and topic-focused QA dataset generation.
Additionally, Dynamic-KGQA produces compact, semantically coherent subgraphs that facilitate both training and evaluation of KGQA models, enhancing their ability to leverage structured knowledge effectively.
Citation
If you use the Dynamic-KGQA dataset or codebase in your work, please cite the following paper:
@article{2025dynamickgqa,
title={Dynamic-KGQA: A Dynamic Knowledge Graph Question Answering Framework},
author={Preetam Prabhu Srikar Dammu and Himanshu Naidu and Chirag Shah},
journal={arXiv preprint arXiv:2109.03893},
year={2025}
}
You can also cite the dataset directly using its DOI.
Knowledge Graph Hosting
While any knowledge graph can be used with the Dynamic-KGQA framework, we built our dataset on the YAGO knowledge graph. YAGO is available for download from the YAGO website.
There are multiple ways to host the YAGO knowledge graph for use with Dynamic-KGQA. Instructions for hosting it in a Blazegraph Docker container are provided in our GitHub repository.
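Once a Blazegraph container is running, the KG can be queried over SPARQL. The sketch below, using only the Python standard library, builds such a request; the port and the `/blazegraph/namespace/kb/sparql` path are Blazegraph's defaults and may need adjusting to match how you actually deployed the container.

```python
# Sketch: preparing a SPARQL query against a locally hosted Blazegraph
# instance. The endpoint URL below is an assumption (Blazegraph's default
# namespace path); adjust it to your own deployment.
from urllib.parse import urlencode
from urllib.request import Request

BLAZEGRAPH_ENDPOINT = "http://localhost:9999/blazegraph/namespace/kb/sparql"

def build_sparql_request(query: str) -> Request:
    """Build a POST request asking the endpoint for JSON-formatted results."""
    data = urlencode({"query": query}).encode("utf-8")
    return Request(
        BLAZEGRAPH_ENDPOINT,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )

# Example: fetch a handful of triples to confirm the KG loaded correctly.
req = build_sparql_request("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5")
# from urllib.request import urlopen
# with urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

The request body and `Accept` header follow the standard SPARQL 1.1 Protocol, so the same snippet should work against other SPARQL endpoints by swapping the URL.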
Dataset Format
The public dataset created by the Dynamic-KGQA framework is available in the Hugging Face Datasets format. You can load it using the following code snippet:
from datasets import load_dataset
dataset = load_dataset("preetam7/dynamic_kgqa")
The dataset contains the following columns:
- id: The unique identifier for the QA pair.
- question: The input question text.
- answer: The answer text (typically a YAGO entity).
- answer_readable: The human-readable answer text.
- answer_uri: The YAGO URI of the answer entity.
- supporting_facts: The supporting YAGO triples for the answer, in the form of knowledge graph triple labels.
- supporting_facts_uri: The URIs of the supporting YAGO triples for the answer.
- subgraph: The subgraph used to generate the QA pair.
- subgraph_size: The size of the subgraph used to generate the QA pair.
- logical_structure_flag_n: Flag by the nth LLM-as-judge, indicating whether the QA pair has a logical structure.
- logical_structure_reasoning_n: Explanation of the logical structure flag by the nth LLM-as-judge.
- redundancy_flag_n: Flag by the nth LLM-as-judge, indicating whether the QA pair is trivial or the question includes the answer.
- redundancy_reasoning_n: Explanation of the redundancy flag by the nth LLM-as-judge.
- answer_support_flag_n: Flag by the nth LLM-as-judge, indicating whether the supporting YAGO triples substantiate the answer.
- answer_support_reasoning_n: Explanation of the answer support flag by the nth LLM-as-judge.
- answer_adequacy_flag_n: Flag by the nth LLM-as-judge, indicating whether the answer is adequate for the question.
- answer_adequacy_reasoning_n: Explanation of the answer adequacy flag by the nth LLM-as-judge.
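As an illustration of how the per-judge flag columns might be consumed, the sketch below collects every judge flag from a single record. The sample record and its flag values are fabricated for demonstration only; the actual flag encoding in the dataset (e.g. booleans vs. strings) may differ.

```python
# Illustrative sketch of working with the column schema above.
# The sample record below is made up; field values are not from the dataset.
def judge_flags(record: dict) -> list:
    """Collect all per-judge flag values from a QA-pair record."""
    prefixes = (
        "logical_structure_flag_",
        "redundancy_flag_",
        "answer_support_flag_",
        "answer_adequacy_flag_",
    )
    return [value for key, value in record.items() if key.startswith(prefixes)]

sample = {
    "id": "qa-0001",                       # fabricated identifier
    "question": "Which country was the composer of 'Peer Gynt' born in?",
    "answer_readable": "Norway",
    "subgraph_size": 12,
    "logical_structure_flag_1": True,
    "redundancy_flag_1": False,            # False: not trivial/redundant
    "answer_support_flag_1": True,
    "answer_adequacy_flag_1": True,
}

print(judge_flags(sample))  # one entry per judge flag column
```

Note that a "passing" QA pair is one where the redundancy flag is negative while the other three flags are positive, so a quality filter should not simply require all flags to be true.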