Architectural Choice in Inverse Design of THz Metamaterials: Local Priors vs. Global Attention under Data Scarcity
DOI: https://doi.org/10.63313/AERpc.9062

Keywords: Terahertz metamaterials, Inverse design, Deep learning

Abstract
The inverse design of terahertz (THz) electromagnetically induced transparency (EIT) metamaterials is complicated by high-dimensional parameter spaces and complex spectral mappings. While deep learning has emerged as a potent tool for addressing these issues, the suitability of distinct neural architectures under data-constrained conditions remains underexplored. Focusing on EIT spectra characterized by long-range frequency dependencies, this study compares the inverse design performance of the Fully Connected Residual Network (FC-ResNet), which leverages local inductive biases, against the Transformer architecture, which relies on global self-attention mechanisms. We evaluated these models using a comprehensive dataset of 20,476 samples and a restricted dataset of only 500 samples. The results demonstrate that in data-rich environments, both architectures achieve exceptional accuracy (R² > 0.99), indicating that architectural differences do not constitute a bottleneck when data is abundant. However, in small-sample regimes, which simulate scenarios with scarce experimental data, the FC-ResNet exhibits significantly superior generalization compared to the Transformer. Our findings suggest that while Transformers offer global modeling potential, they are prone to overfitting when data is limited; conversely, the structural priors of the FC-ResNet provide essential regularization in such contexts. This work offers empirical guidance for selecting architectures in the intelligent design of metamaterials where data availability is a limiting factor.
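The regularizing "structural prior" of the FC-ResNet mentioned above is the identity skip connection of residual learning. A minimal sketch, assuming nothing about the paper's exact layer widths or depth (all sizes below are illustrative), shows why the skip path biases each block toward small perturbations of its input:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def fc_residual_block(x, W1, b1, W2, b2):
    """One fully connected residual block: output = x + F(x).

    The identity skip connection is the structural prior: the block
    only needs to learn a residual correction F(x), so the "do
    nothing" solution is trivially representable, which acts as a
    regularizer in small-sample regimes. Shapes here are illustrative,
    not the architecture used in the study.
    """
    h = relu(x @ W1 + b1)        # hidden transform F's first layer
    return x + (h @ W2 + b2)     # residual (skip) connection

# With zero-initialized weights the block is exactly the identity map,
# illustrating that the prior solution is available before training.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))              # batch of 4, feature dim 8
W1, b1 = np.zeros((8, 16)), np.zeros(16)
W2, b2 = np.zeros((16, 8)), np.zeros(8)
assert np.allclose(fc_residual_block(x, W1, b1, W2, b2), x)
```

By contrast, a self-attention layer mixes all positions globally and carries no comparable bias toward the identity, which is one plausible reading of why it needs more data to generalize here.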
License
Copyright (c) 2025 by author(s) and Erytis Publishing Limited.

This work is licensed under a Creative Commons Attribution 4.0 International License.








