HiCLAS1 Technical Reports
HPC-2009-4: AERMOD-HPCS for Linux™ (Part 2)
Copyright © 2009 HiCLAS1

NUMERICAL ANALYSIS OF AERMOD-HPCS (Build 2) ON Linux™ PLATFORMS

George Delic and Arnold R. Srackangast
1.0 INTRODUCTION

This report presents a numerical analysis of the Air Quality Model (AQM) AERMOD on commodity platforms running 32-bit and 64-bit Linux™ operating systems. New results are presented for the numerical differences observed between two versions of AERMOD: a local compilation of the source code released by the U.S. EPA (hereafter AERMOD-EPA) and the High Performance Computing (HPC) version developed by HiCLAS1 (AERMOD-HPCS), released as Build 2 of v1.8 (hereafter v1.8.2), which is the first Linux release. Both versions are designed to execute the U.S. EPA's regulatory AERMOD model on a single processor (or core); no parallel version is studied in this report.

This report covers the commodity processors and Linux™ platforms listed in Table 1 of report HPC-2009-3. Previous reports presented results for AERMOD-HPCS on Windows™ operating systems for 32-bit platforms (HPC-2009-1, HPC-2009-2). The purpose of this report is to display the numerical differences observed for these versions of AERMOD on commodity architectures with Linux™ operating systems, in order to address the requirement of a Model Equivalence Demonstration (MED).

2.0 CHOICE OF HARDWARE AND OPERATING SYSTEM

The hardware used for the results reported here includes three Intel® (Intel) processors and one Advanced Micro Devices (AMD) processor. The previous report identifies the platforms used in this analysis together with their attributes; for further information, visit the Web addresses given in the References Section. The survey includes older and newer generations of CPUs, from Itanium™ systems to quad core servers. The goal was to survey a variety of processors suitable for AERMOD simulations on Linux™ platforms. Because of the scope of this study, extensive tabulations of results are not reproduced here but are available as downloadable PDF files. The tables that follow show only global information or totals, to facilitate the comparisons made in the discussion.
3.0 CHOICE OF COMPILERS

The source code for AERMOD is distributed by the U.S. EPA at http://www.epa.gov/scram001 and was compiled for all the results designated here as AERMOD-EPA. The results designated here as AERMOD-HPCS were obtained by compiling the AERMOD-HPCS source code, which has been modified from the U.S. EPA source distribution available at the U.S. EPA SCRAM Web portal named above. The compiler used for AERMOD-HPCS in this analysis (and in the distribution) is not named here, but was chosen after testing the most popular compilers currently available. Considerable effort has been invested in exhaustively testing multiple compiler options to enable the best performance consistent with the code structure changes employed at HiCLAS1.

4.0 CHOICE OF BENCHMARKS

The AERMOD model describes pollutant dispersion and deposition and is now an approved regulatory model for new source reviews and other permitting applications. It is available as source code or as an executable model at the U.S. EPA's Support Center for Regulatory Air Models at the URL named above. However, the executable is provided only as a Windows™ application, with no corresponding Linux™ release. The version used here is AERMOD 07026; it is designated AERMOD-EPA to indicate compilation from the U.S. EPA source.

To create the High Performance Computing (HPC) version of AERMOD, the source code of the U.S. EPA distribution was progressively modified to enhance performance. The resulting code is designated AERMOD-HPCS, and at v1.8 (the current release is Build 2) it was deemed a sufficient improvement over AERMOD-EPA to warrant exhaustive Quality Assurance (QA) for the purposes of a Model Equivalence Demonstration. For QA testing, the four Cases listed in Table 2 of HPC-2007-1 were used as benchmarks.
These benchmarks are considered to be representative of actual applications of AERMOD. Input and output files for Case 2 are included in the distribution so that the installation can be tested after download of the AERMOD-HPCS executable model.

5.0 BENCHMARK RESULTS

5.1 Definition of nomenclature

Individual concentration results produced by AERMOD-HPCS in both Build 1 (previous release) and Build 2 (current release) were compared against concentration results from a local compilation of the unmodified U.S. EPA source code. For each platform identified in the previous report (HPC-2009-3) there is only a single comparison group, B, and Table 1 summarizes the notation used here for five Linux™ platforms.
A comparison was made for a total of 14,366 individual concentration values across all four cases used in the benchmarks, and the absolute and relative errors were tabulated. Whenever the absolute error of a comparison exceeded a tolerance of 2.0e-05, a counter was incremented. This choice of tolerance is dictated by the use of single precision and by the precision used in numerous constants throughout the AERMOD source code (as distributed by the U.S. EPA). A higher tolerance criterion is not warranted in our judgement. In each group, a global maximum absolute error and the corresponding relative error were tabulated separately; these results are summarized below. Full details may be found in the five tables, one per platform, available as downloadable PDF files. These tables use the nomenclature defined in Table 1 above.

5.2 Global results

For each platform and group, Table 2 below summarizes the global results of the individual comparisons. Only group B is shown, corresponding to the comparison of AERMOD-HPCSv1.8 Build 2 against the local compilation of the U.S. EPA source code. Tables B-11 and B-D3 are for a 32-bit version of AERMOD-HPCSv1.8 executed on a 32-bit and a 64-bit Linux™ operating system, respectively. Tables B-14, B-17, and B-100 are for a 64-bit build executed on 64-bit Linux™ operating systems. The Intel platforms show maximum absolute errors in the range 2.0e-05 to 5.4e-02. For the AMD platform, however, the maximum absolute error increases to 9.3e-02. The number of comparisons that exceed the tolerance of 2.0e-05 also rises, to 917 (B-D3) out of the 14,366 values compared.
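The comparison procedure described in Section 5.1 can be illustrated with a minimal Python sketch. It is not the tool used to produce the tables in this report. The function name `compare`, the `ComparisonSummary` record, and the sample values are hypothetical, and the definition of relative error (absolute error divided by the magnitude of the reference value, taken as zero when the reference is zero) is an assumption, since the report does not state it.

```python
from dataclasses import dataclass
from typing import Sequence

TOLERANCE = 2.0e-05  # absolute-error tolerance quoted in Section 5.1
MED_LIMIT = 0.02     # 2% relative-error limit quoted in Section 5.3


@dataclass
class ComparisonSummary:
    n_values: int          # number of concentration values compared
    n_exceed: int          # count of absolute errors above the tolerance
    max_abs_err: float     # global maximum absolute error
    rel_err_at_max: float  # relative error at the location of that maximum
    max_rel_err: float     # largest relative error over all values


def compare(reference: Sequence[float], candidate: Sequence[float],
            tol: float = TOLERANCE) -> ComparisonSummary:
    """Compare candidate concentrations against reference concentrations."""
    if len(reference) != len(candidate):
        raise ValueError("inputs must have the same length")
    n_exceed = 0
    max_abs = 0.0
    rel_at_max = 0.0
    max_rel = 0.0
    for ref, cand in zip(reference, candidate):
        abs_err = abs(cand - ref)
        # Assumption: relative error is measured against the reference value
        # and is taken as zero when the reference value is zero.
        rel_err = abs_err / abs(ref) if ref != 0.0 else 0.0
        if abs_err > tol:
            n_exceed += 1
        if abs_err > max_abs:
            max_abs = abs_err
            rel_at_max = rel_err
        max_rel = max(max_rel, rel_err)
    return ComparisonSummary(len(reference), n_exceed, max_abs,
                             rel_at_max, max_rel)


# Illustrative values only; these are not results from the report.
epa_values = [1.0, 2.0, 4.0]
hpcs_values = [1.0, 2.5, 4.0]
summary = compare(epa_values, hpcs_values)
print(summary)
print("within MED limit:", summary.max_rel_err < MED_LIMIT)
```

The counter and the global maximum are tracked in a single pass, which mirrors the way the report describes one tolerance count and one maximum per comparison group.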
5.3 Analysis of results

Numerical results should all show a maximum absolute error of the order of 2.0E-05 to 5.0E-05 (a tolerance determined by the precision of constants used in the code). This is the observed result for the Intel Pentium 4 Xeon (B-11) and Itanium (B-100) platforms. However, important differences are observed when comparing the results for B-17 and B-100: the maximum absolute error is three orders of magnitude larger on the x86_64 platforms (Intel EM64T or the Intel quad core) than on the IA64 (Itanium) platform. On both types of platform the same compiler options were used for the same codes in 64-bit versions of AERMOD-HPCSv1.8 Build 2. Furthermore, the results on the AMD platform (B-D3) show larger differences. It should be noted, however, that in no case does the maximum relative error reach the 2% limit required by the MED.

AERMOD-HPCSv1.8 Build 2 used a newer version of the compiler with new optimization options to enhance portable performance and, as a result, a reduction in the portability of precision is evident when this model is moved across hardware from different vendors. When migrating any model across architectures it is important to attain both portable performance and portable precision. An important goal, therefore, is to understand what causes this issue so that both performance and numerical stability are attained on the new multi-core architectures.

6.0 CONCLUSIONS

This numerical analysis of AERMOD-HPCS on Linux™ platforms shows that it delivers a solution with small numerical differences when compared with the results of a local compilation of the U.S. EPA's distribution of the AERMOD source code. However, comparison of the numerical differences for 32-bit versus 64-bit operating systems indicates variability in precision. Furthermore, while performance optimizations may give portable performance enhancements in Build 2 of AERMOD-HPCSv1.8, these may come at the price of portability of numerical precision.
The numerical results of 32-bit and 64-bit Linux™ builds of AERMOD-HPCS are still well within the accuracy tolerance set by the Model Equivalence Demonstration requirements. Future reports will detail resolutions to issues of numerical precision portability.