Graphusion – Leveraging Large Language Models for the Construction and Operationalization of Knowledge Graphs

Background and Motivation

Knowledge graphs (KGs) represent structured information that enables complex reasoning across heterogeneous sources, yet building them from unstructured data remains labor-intensive. Recent large language models (LLMs) show strong capabilities in information extraction and semantic understanding, but turning LLM output into scalable, dynamically updated knowledge graphs while ensuring reliability remains an open challenge.

Research Questions

  1. How can LLMs automatically extract entities, relations, and temporal links from multimodal evidence for robust KG construction?
  2. Which LLM strategies (prompt engineering, fine-tuning, retrieval-augmented generation) best balance the accuracy-scalability trade-off?
  3. How do graph-based methods enhance actionable insights from LLM-generated knowledge graphs?
  4. What evaluation protocols ensure scientific validity and robustness to noise and conflicting information?

Objectives and Contributions

Core System Development: Design an automated pipeline integrating LLMs for entity/relation extraction with incremental KG updates, connected to interactive timeline and geographic visualizations.
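The extraction-plus-incremental-update loop can be sketched in miniature. The JSON-lines output format and the in-memory set-based graph are illustrative assumptions (the actual LLM call is abstracted away); the point is that repeated ingestion of the same evidence must not duplicate edges:

```python
import json

def parse_llm_triples(llm_output: str):
    """Parse LLM output assumed to emit one JSON object per line:
    {"subject": ..., "relation": ..., "object": ...} (hypothetical format)."""
    triples = []
    for line in llm_output.strip().splitlines():
        record = json.loads(line)
        triples.append((record["subject"], record["relation"], record["object"]))
    return triples

def merge_triples(graph: set, triples):
    """Incrementally merge triples into an in-memory graph, returning
    how many edges were genuinely new (duplicates are skipped)."""
    added = 0
    for triple in triples:
        if triple not in graph:
            graph.add(triple)
            added += 1
    return added

# The model extracted the same fact twice; only one edge is added.
llm_output = (
    '{"subject": "GPT-4", "relation": "extracts", "object": "entities"}\n'
    '{"subject": "GPT-4", "relation": "extracts", "object": "entities"}'
)
graph: set = set()
new_edges = merge_triples(graph, parse_llm_triples(llm_output))
```

In the full pipeline the set would be replaced by a persistent graph store, but the idempotency contract stays the same.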

Advanced Analytics: Apply graph theory metrics and graph neural networks to derive higher-level insights from the constructed knowledge graphs.
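As a concrete instance of one such graph-theory metric, here is a minimal degree-centrality computation over extracted edges, in pure Python as a stand-in for a full graph-analytics stack; the example edge list is invented:

```python
from collections import Counter

def degree_centrality(edges):
    """Degree centrality of each node, normalized by (n - 1) as in the
    standard definition, for an undirected edge list."""
    degree = Counter()
    nodes = set()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        nodes.update((u, v))
    n = len(nodes)
    return {node: degree[node] / (n - 1) for node in nodes}

edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]
centrality = degree_centrality(edges)
# "A" touches all 3 other nodes, so its centrality is 1.0
```

GNN-based insights would build on the same edge-list representation, feeding adjacency structure into a learned model rather than a closed-form metric.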

Comprehensive Evaluation: Benchmark multiple LLMs (GPT-4, LLaMA-3, Jais) across zero-shot, few-shot, and fine-tuned approaches, measuring precision/recall, temporal accuracy, hallucination rates, and cost-performance trade-offs.
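Precision, recall, and hallucination rate over extracted triples could be scored as below. Defining the hallucination rate as the share of predictions unsupported by the gold set is one possible operationalization, not the only one; the example triples are invented:

```python
def triple_metrics(predicted, gold):
    """Precision/recall over exact triple matches; hallucination rate is
    taken here as the fraction of predictions absent from the gold set."""
    pred_set, gold_set = set(predicted), set(gold)
    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    hallucination_rate = 1.0 - precision
    return precision, recall, hallucination_rate

predicted = [("Paris", "capital_of", "France"), ("Paris", "capital_of", "Spain")]
gold = [("Paris", "capital_of", "France"), ("Madrid", "capital_of", "Spain")]
p, r, h = triple_metrics(predicted, gold)
```

Cost-performance trade-offs would layer per-model token pricing on top of these quality scores.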

Methodology and Initial Timeline

Phase 1 (Months 1-2): Literature review and baseline experiments with LLM-based information extraction on curated multimodal datasets.

Phase 2 (Months 3-4): Core system implementation including graph database integration (Neo4j/RDF), incremental update mechanisms, and visual components linking entities to temporal/geographic contexts.
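For the Neo4j path, incremental updates map naturally onto Cypher's MERGE clause. A sketch of the query construction follows; the `Entity` label and `name` property are illustrative modeling assumptions, and in the running system the query would be executed through the official Neo4j Python driver:

```python
import re

def merge_triple_cypher(subject: str, relation: str, obj: str):
    """Build an idempotent Cypher MERGE for one triple. Relationship types
    cannot be query parameters in Cypher, so the type is sanitized and
    interpolated; node names are passed as proper parameters."""
    rel_type = re.sub(r"[^A-Za-z0-9_]", "_", relation).upper()
    query = (
        "MERGE (s:Entity {name: $subject}) "
        "MERGE (o:Entity {name: $object}) "
        f"MERGE (s)-[r:{rel_type}]->(o)"
    )
    return query, {"subject": subject, "object": obj}

query, params = merge_triple_cypher("Graphusion", "built on", "Neo4j")
```

Because MERGE matches before creating, re-ingesting the same document leaves the graph unchanged, which is exactly the incremental-update behavior Phase 2 targets.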

Phase 3 (Month 5): Advanced graph analytics implementation, comprehensive LLM comparison study, and robustness testing with synthetic and real-world noisy data.
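Synthetic noise for the robustness tests can be generated by corrupting a controlled fraction of gold triples; one simple, seeded corruption scheme (swapping the object for a random entity already in the graph) might look like this, with the exact corruption model an open design choice:

```python
import random

def inject_noise(triples, noise_rate: float, seed: int = 42):
    """Corrupt a fraction of triples by replacing the object with a random
    entity drawn from the graph, simulating noisy LLM extractions.
    Seeding keeps each robustness run reproducible."""
    rng = random.Random(seed)
    entities = [s for s, _, _ in triples] + [o for _, _, o in triples]
    noisy = []
    for s, r, o in triples:
        if rng.random() < noise_rate:
            o = rng.choice(entities)
        noisy.append((s, r, o))
    return noisy

clean = [("A", "r1", "B"), ("C", "r2", "D")]
untouched = inject_noise(clean, noise_rate=0.0)
corrupted = inject_noise(clean, noise_rate=1.0)
```

Sweeping `noise_rate` then yields a degradation curve for each LLM configuration under test.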

Phase 4 (Month 6): Evaluation across intrinsic metrics (extraction accuracy, temporal consistency) and extrinsic measures (downstream task performance), with human expert validation studies.
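The temporal-consistency metric can be made concrete with a simple intrinsic check; the five-tuple fact format (subject, relation, object, start, end) is an assumed representation, and real checks would also cover cross-fact constraints:

```python
def temporal_consistency(facts):
    """Fraction of temporal facts (subject, relation, object, start, end)
    whose interval is well-formed (start <= end)."""
    if not facts:
        return 1.0
    valid = sum(1 for *_, start, end in facts if start <= end)
    return valid / len(facts)

facts = [
    ("CompanyX", "acquired", "StartupY", 2019, 2020),  # consistent interval
    ("CompanyX", "founded", "LabZ", 2022, 2021),       # end precedes start
]
score = temporal_consistency(facts)
```

Extrinsic measures would complement this by scoring downstream tasks (e.g., question answering over the graph) rather than the graph itself.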

Expected Impact

The resulting system will demonstrate practical applications in domains such as crisis monitoring, supply-chain analysis, and academic literature mapping, while contributing methodological advances to both the LLM and knowledge graph communities. The modular design ensures reproducibility and extensibility for future research.