Architectural Choice in Inverse Design of THz Metamaterials: Local Priors vs. Global Attention under Data Scarcity
DOI: https://doi.org/10.63313/AERpc.9062

Keywords: Terahertz metamaterials, Inverse design, Deep learning

Abstract
The inverse design of terahertz (THz) electromagnetically induced transparency (EIT) metamaterials is complicated by high-dimensional parameter spaces and complex spectral mappings. While deep learning has emerged as a potent tool for addressing these issues, the suitability of distinct neural architectures under data-constrained conditions remains underexplored. Focusing on EIT spectra characterized by long-range frequency dependencies, this study compares the inverse design performance of the Fully Connected Residual Network (FC-ResNet), which leverages local inductive biases, against the Transformer architecture, which relies on global self-attention mechanisms. We evaluated these models using a comprehensive dataset of 20,476 samples and a restricted dataset of only 500 samples. The results demonstrate that in data-rich environments, both architectures achieve exceptional accuracy (R² > 0.99), indicating that architectural differences do not constitute a bottleneck when data is abundant. However, in small-sample regimes, which simulate scenarios with scarce experimental data, the FC-ResNet exhibits significantly superior generalization compared to the Transformer. Our findings suggest that while Transformers offer global modeling potential, they are prone to overfitting when data is limited; conversely, the structural priors of the FC-ResNet provide essential regularization in such contexts. This work offers empirical guidance for selecting architectures in the intelligent design of metamaterials where data availability is a limiting factor.
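The regularizing "structural prior" of the FC-ResNet mentioned above is the identity skip connection of residual learning. A minimal sketch, assuming nothing about the paper's exact layer widths or depth (all sizes below are illustrative), shows why the skip path biases each block toward small perturbations of its input:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def fc_residual_block(x, W1, b1, W2, b2):
    """One fully connected residual block: output = x + F(x).

    The identity skip connection is the structural prior: the block
    only needs to learn a residual correction F(x), so the "do
    nothing" solution is trivially representable, which acts as a
    regularizer in small-sample regimes. Shapes here are illustrative,
    not the architecture used in the study.
    """
    h = relu(x @ W1 + b1)        # hidden transform F's first layer
    return x + (h @ W2 + b2)     # residual (skip) connection

# With zero-initialized weights the block is exactly the identity map,
# illustrating that the prior solution is available before training.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))              # batch of 4, feature dim 8
W1, b1 = np.zeros((8, 16)), np.zeros(16)
W2, b2 = np.zeros((16, 8)), np.zeros(8)
assert np.allclose(fc_residual_block(x, W1, b1, W2, b2), x)
```

By contrast, a self-attention layer mixes all positions globally and carries no comparable bias toward the identity, which is one plausible reading of why it needs more data to generalize here.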
License
Copyright (c) 2025 by author(s) and Erytis Publishing Limited.

This work is licensed under a Creative Commons Attribution 4.0 International License.








