Benchmarking LLM Sensitivity to Prompt Formats: A Contamination-Free Approach

Fumio Miyata

doi:10.5281/zenodo.20192002

2026-05-15 by Fumio Miyata

Author: Fumio Miyata https://orcid.org/0009-0008-8797-5578
Affiliation: Noetics Institute / noetics.institute
Published at: Zenodo Preprint
DOI: 10.5281/zenodo.20192002
Original record: https://doi.org/10.5281/zenodo.20192002

Abstract

The evaluation of Large Language Models (LLMs) is complicated by prompt sensitivity and data contamination, which obscure the distinction between genuine reasoning and rote memorization. This paper introduces a reproducible, contamination-free benchmark to measure how LLM responses vary with the prompt’s language, style, and syntactic format. Our methodology uses the constructed language Lojban (virtually absent from pre-training corpora) and a suite of novel symbolic prompting tasks to assess a model’s ability to interpret unfamiliar formal systems. The results indicate three key findings: (1) prompt strictness can elicit latent capabilities, but its effectiveness is limited to familiar languages; (2) models exhibit a significant ceiling in algorithmic complexity, failing to produce bug-free code for novel tasks; and (3) performance appears more indicative of sophisticated pattern matching than abstract reasoning. This work provides a comprehensive dataset and a rigorous framework for evaluating the generalization and true reasoning abilities of LLMs. The accompanying code and data are archived on Zenodo (DOI: https://doi.org/10.5281/zenodo.18043860).

Download

PDF (recommended for Google Scholar indexing)
- http://noetics.institute/wp-content/uploads/2026/05/prompt_format_benchmark_en.pdf
- Zenodo version:
  https://zenodo.org/records/20192002/files/prompt_format_benchmark_en.pdf?download=1

Citation

English citation:

Fumio Miyata, “Benchmarking LLM Sensitivity to Prompt Formats: A Contamination-Free Approach”, Zenodo, 2026.
DOI: 10.5281/zenodo.20192002.

Also available on ResearchGate

https://www.researchgate.net/publication/404884143_Benchmarking_LLM_Sensitivity_to_Prompt_Formats_A_Contamination-Free_Approach
This paper is also listed on ResearchGate for academic visibility and researcher discovery.

BibTeX

@article{miyata_bls_2026,
  title   = {Benchmarking LLM Sensitivity to Prompt Formats: A Contamination-Free Approach},
  author  = {Fumio Miyata},
  year    = {2026},
  doi     = {10.5281/zenodo.20192002},
  url     = {https://doi.org/10.5281/zenodo.20192002},
  journal = {Zenodo Preprint}
}

Code & Data

All experimental resources and implementations are openly available on GitHub:

https://github.com/aikenkyu001/benchmarking_llm_against_prompt_formats/tree/main
