Benchmarking LLM-Based Synthetic Data Generators for Structural Coherence in Behavioral and Human-Centered Datasets
Scholarship details
| Study levels | Degree |
|---|---|
| Close date | Monday, 22 September 2025 |
| Domestic/international | Domestic Only |
About the scholarship
This project benchmarks large language model (LLM)-based synthetic data generators, such as GReaT and TabulaLLM, with a focus on their ability to preserve structural coherence in behavioural and human-centred datasets. These datasets encompass psychological, educational, and user behaviour data that often include ordinal scales, categorical variables, logical constraints, and complex theory-driven relationships unique to human-centred research. The project will evaluate how effectively current LLM-based models generate synthetic data that maintains these important structural and semantic properties.
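As a rough illustration of what a structural coherence check could look like, the sketch below tests synthetic records against an ordinal range, a set of allowed categories, and one logical constraint. The column names (`stress_item`, `education`, `is_student`, `year_level`), the allowed values, and the rule itself are hypothetical and chosen only for illustration; they are not taken from the project or from any particular generator.

```python
from typing import Any

# Hypothetical schema, for illustration only (not from the project description).
LIKERT_RANGE = range(1, 6)  # ordinal item scored 1 to 5
EDUCATION_LEVELS = {"primary", "secondary", "tertiary"}


def coherence_violations(row: dict[str, Any]) -> list[str]:
    """Return the names of the structural checks that a record fails."""
    violations = []
    # Ordinal scale: the value must be one of the defined scale points.
    if row.get("stress_item") not in LIKERT_RANGE:
        violations.append("ordinal_range")
    # Categorical variable: the value must be a known category.
    if row.get("education") not in EDUCATION_LEVELS:
        violations.append("category")
    # Logical constraint (assumed): a non-student cannot report a year level.
    if not row.get("is_student") and row.get("year_level") is not None:
        violations.append("logic")
    return violations


def violation_rate(rows: list[dict[str, Any]]) -> float:
    """Fraction of records that fail at least one check."""
    if not rows:
        return 0.0
    bad = sum(1 for r in rows if coherence_violations(r))
    return bad / len(rows)
```

Running `violation_rate` over a batch of generated records gives one simple summary figure. A full benchmark would presumably also need to assess the theory-driven relationships between variables, which this sketch does not attempt.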
Entry requirements
A completed online application must be submitted by 4.30 pm on 22 September 2025. Late or incomplete applications will not be accepted. Any required supporting documentation (including references) must also be received by 4.30 pm on the closing date for the application to be considered.