FEMAG: A High Performance Parallel Finite Element Toolbox for Electromagnetic Computations

This paper presents a parallel finite element toolbox for computing large electromagnetic devices on unstructured tetrahedral meshes, FEMAG—Fem for ElectroMagnetics on Adaptive Grids. The finite element toolbox deals with unstructured tetrahedral meshes and can solve electromagnetic eddy current problems in both frequency domain and time domain. It adopts high-order edge element methods and refines the mesh adaptively based on reliable and efficient finite element a posteriori error estimates. We demonstrate the competitive performance of FEMAG by extensive numerical experiments, including TEAM (Testing Electromagnetic Analysis Methods) Problem 21 and the simulation for a single-phase power transformer.


Introduction
Large devices in electric engineering usually have very complicated structures and are made of anisotropic and nonlinear materials. A large power transformer, for example, consists of nonmagnetic plates, magnetic oil tank, grain-oriented steel laminations, complex exciting coils and so on. It is extremely difficult to simulate the whole structure of a large power transformer. Commercial software is usually not competent for such a task because of memory limitation and low scalability for large computers. Moreover, the inefficiency of algebraic system solver also limits their applications to large-scale computing. The purpose of this paper is to propose a parallel finite element toolbox, FEMAG (Fem for ElectroMagnetics on Adaptive Grids), and to demonstrate the competitive performance of our finite element algorithms and the FEMAG. We propose to study the following eddy current problem: (1) where is the electric field, is the magnetic flux, is the magnetic field, and is the current density defined by (2) Here σ is the electric conductivity, J s is the source current density carried by some coils, and c Ω denotes the conducting region. The eddy current problem is a quasi-static approximation of Maxwell's equations at very low frequency by neglecting the displacement currents in Ampere's law (see [1]). The material parameters could be very complicated for electric engineering applications. For example, grain-oriented steel laminations are widely used in iron cores and magnetic shields of large power transformers (see Figure 1 for a TEAM benchmark model) [2]. They are very anisotropic and have multiple scales. The transformer size could be 6 10 times the thickness of coating films over laminations. The magnetic permeability differs a lot in the rolling direction and the other two orthogonal directions. E B = µH H J Furthermore, the energy loss in coils should also be considered for complex exciting source. This necessitates huge number of elements in partitioning the domain. Full three-dimensional (3D) finite element simulation is very difficult due to extensive unknowns. With the evolution of the computer technology, the massively parallel computing with ten thousands of even hundred thousands of CPU cores becomes one of the trends of the scientific computing. The main challenge for simulating large transformers is focused on developing efficient algorithms for solving the discrete problem and scalable parallel finite element codes for large supercomputers.
The second issue concerns low regularity of the solution of Maxwell's equations. It is well-known that the solution of Maxwell's equations may have local singularities at corners and edges of the structure and material interfaces. Numerical methods based on uniform meshes are inefficient in resolving the local singularities. The adaptive finite element method (AFEM) based on the a posteriori error estimates has been successfully and widely applied in many other areas [3][4][5][6], which provides a systematic way to achieve the optimal computational complexity by refining the mesh according to the local a posteriori error estimator on the elements. Unfortunately, parallel implementation of the AFEM on distributed memory parallel computers is very difficult because of the complexities of the mesh management and load balance issues. Also, highly efficient numerical methods for solving the linear system resulting from finite element discretization are required. For facilitating implementing the AFEM, we have developed the toolbox PHG, Parallel Hierarchical Grid [7]. The motivation of this toolbox is to support the research on AFEM algorithms and development of AFEM codes. PHG deals with conforming tetrahedral meshes and uses bisection for adaptive mesh refinement and MPI for message passing. Using the idea of object oriented design, the details of complex mesh management and parallelism are hidden from users. PHG provides supports for adaptive finite element computations, such as finite element bases (including the Nédéléc edge elements for electromagnetic computations) [8], numerical quadrature, and basic operations with finite element functions. For building, assembling, and solving linear systems and eigenvalue problems resulting from finite element discretization, an unified linear algebra module for manipulating distributed sparse matrices stored in compressed sparse rows (CSR) and distributed vectors is provided,. Load balancing is achieved through mesh repartitioning and redistribution.
FEMAG is developed based on PHG. It solves both time-dependent and time-harmonic eddy current problems and can deal with nonlinear and anisotropic materials. For large-scale computing, the scalability of algebraic system solver plays the key role in parallel finite element codes. An efficient solver should possess two properties: (1) its convergence rate should keep quasi-uniform as the number of elements increases; (2) its convergence rate should not degenerate when using massive CPU cores. In FEMAG, the Maxwell solver for the discrete problem is a GMRES algorithm with the auxiliary space preconditioning [9]. And the auxiliary problems are solved by the conjugate gradient method with the Boomer algebraic multigrid preconditioning [10]. The most challenging task is to compute the iron loss and eddy current density in grain-oriented steel laminations. An iron core may consist of hundreds of even more than one thousand of laminations, each of which is only 0.18-0.35mm thick. But the thickness of the coating film over the laminations is only 2-5µm. Thus the lamination stack has multi-scale sizes and the ratio of the largest scale to the smallest scale can amount to 10 6 (see Figure 1). Full 3D finite element modeling is extremely difficult due to extensive unknowns from meshing both laminations and the coating film. There are very few papers on the computation of 3D eddy currents in the literature. In FEMAG, steel laminations are treated by two approaches: 1. Compute 3D eddy current density with an approximate eddy current model that omits the coating film and meshes each lamination respectively [11][12]. This approach yields accurate results but requires a fine mesh for the lamination stack. 2. Compute 3D eddy current density with effective or homogenized conductivity and permeability [13]. This approach is very fast but less accurate in computing iron loss in laminations than the previous one. But it is more favorable in the simulation of the whole of a power transformer. In this paper, we present the numerical results for a power transformer by using homogenized conductivity and permeability.
In this paper, we present a numerical experiment on hp-adaptive finite element method for time-harmonic eddy current problem. The h-adaptive finite element method reduces the error by local mesh refinements. It has been well-studied in the a posteriori error analysis, mesh refinement algorithms, and optimal complexity. Using the idea of error equidistribution, the h-adaptive method based on a posteriori error estimates could yield a quasi-optimal approximation with algebraic convergence rate where η h is the a posteriori error estimate, d is the spatial dimension, p is the polynomial degree, and N h is the number of degrees of freedom. However, due to the singularity of the solution, the quasi-optimality η h will degenerate for higher-order finite elements since the constant C may blow up with increasing p .
The hp-adaptive finite element method reduces the error by both local mesh refinement and local improvement of polynomial degrees. It is more efficient than the pure h-adaptive or p-adaptive methods and could reduce the error exponentially. For example, the optimal convergence rate of the hp-adaptive method is for two dimensional elliptic problems [14], and is also conjectured to be for three dimensional elliptic problems [15], where δ , C are positive constants independent of h and p , hp η is the a posteriori error estimate for the hp-adaptive method, and hp N is the number of degrees of freedom. But for solutions with edge singularities, the meshes leading to the exponential convergence must be obtained by anisotropic refinements, that is, by using "needle elements'' which are parallel to the edges (see [15] for more comments). The implementation of the hp-adaptive method is very challenging for higher dimensional problems. In this paper, we present an hp-type a posteriori error estimate for the time-harmonic eddy current model. Then based on the a posteriori error estimate, an hp-adaptive algorithm is proposed by the strategy of predicted error reduction. We implemented in FEMAG the parallel hp-adaptive method on unstructured tetrahedral meshes. The numerical experiment is performed on TEAM Problem 21 a -2.
The hp-adaptive method shows exponential decay of the a posteriori error estimate and the numerical results agree well with experimental data.
The second numerical experiment is performed on TEAM Problem 21 b -MN which has a magnetic plate and a nonmagnetic plate. It shows that, on 12288 CPU cores, FEMAG has very good scalability for nonlinear eddy current problems. The last numerical experiment is performed on a homemade single-phase transformer. We use effective material parameters for iron core and iron yokes. The relative error of the iron loss is less than 10%.

The A-Formulation of Eddy Current Problems
We start by the -formulation of eddy current model for laminated conductors. Let A be the magnetic vector potential satisfying ∇ × A = B . Then the -formulation of eddy current model reads (cf. e.g. [11]): σ ∂A ∂t where Ω is the truncated domain with boundary Γ , n is the unit outer normal of Γ , and stands for the excited current or source current satisfying ∇ ⋅ J s = 0 . In (1), σ is the conductivity, ν = diag ( H 1 B 1 , H 2  The weak formulation of (1) reads: Find A ∈ H 0 (curl,Ω) such that, for all v ∈ H 0 (curl,Ω) , Let n ℑ be a tetrahedral triangulation of Ω at time n t such that ℑ n Ω c forms a tetrahedral mesh of c Ω . We define the edge element space as follows where is the space of polynomials of order 0 > k . The fully discrete approximation to (4) reads: Given ,

Scalability for Nonlinear Eddy Current Problem
In this subsection, we report the numerical experiment for the TEAM Problem 21 b [2]. The conducting region consists of a magnetic plate and a nonmagnetic plate (see Figure 2). The unit of dimensions in Figure 2 is millimeter. The thickness of the plate is 10mm. The conductivity is 6 10 484 . 6 × Siemens/Metre. The height of each coil is 12mm and the radiuses of the inner arc and the outer arc at four corners are 10mm and 45mm respectively. The vertical distance between them is 24mm. The two coils carry the source currents of 3000 Ampere/Turn in opposite directions. The frequency of source currents is 50 Hertz.  We carry out the computation on the super computer Tianhe-1A, Tianjin, China. We use the second-order edge element method of the second family [8] to solve the problem. The largest number of CPU cores used is 12288 and the finest mesh contains 0.44 billions of degrees of freedom. Table 1 shows that the weak scalability is larger than 70%. It also shows that, with 12288 CPU cores and 0.44 billions of unknowns, the solution time for the algebraic system is only 11 minutes. Table 2 shows the measured iron loss versus the calculated iron loss in the steel plates. They agree very well with each other. Figure 3 shows the experimental values and the numerical values of the magnetic flux density.

Hp-Adaptive Finite Element Method for Time-Harmonic Eddy Current Problem
The second numerical experiment is to test the hp-adaptive finite element method for TEAM Problem 21 a -2 (see [2]). This problem consists of a non-magnetic steel plate with two slits and two racetrack shaped coils (see Figure 3). The steel plate has a conductivity of 6 10 3889 . 1 × Siemens/Metre. The driving current for each coil is 3000 Ampere/Turn and has a frequency of 50 Hz. The driving currents for the two coils are in opposite directions. Since the material parameters are linear, we consider the time-harmonic eddy current problem where i is the imaginary unit, ω is the angular frequency, and µ is the magnetic permeability in the empty space. Let F(ℑ h ) be the set of interior faces on the mesh ℑ h .
For any face F ∈ F(ℑ h ) with F = K 1 ∩ K 2 , we denote the For convenience in notation, we define the residual functions on each K ∈ ℑ h and F ∈ F(ℑ h ) as follows Then the error indicator is defined on each element as follows [16] It is used to refine the mesh adaptively. The global and maximal a posteriori error estimates are defined by FEMAG uses the predicted error decrease strategy (PEDS) to refine the mesh. The algorithm assumes that the solution is locally smooth and the optimal convergence can be obtained by either local h-refinement or local p-refinement. Then on each element K marked for refinement, an error decrease factor K λ is computed to judge which of the h-refinement and p-refinement should be performed. According to [16], K λ can be defined heuristically as follows is the parent element of K . It may happen that K T = if the element is not refined at the previous adaptive iteration. The PDES is presented as follows: Algorithm PDES. Given a tolerance 0 > ε and an initial mesh ℑ 0 and an initial distribution of polynomial degrees 1. Solve for the finite element solution u hp ∈ U(ℑ l , P l ) .

Compute the local error indicator
the global error estimate l η , and the maximal error be the set of elements marked for refinement.
(2). Refine l Î according to the predicted error decrease strategy, namely, (3). For any ; otherwise, refine K using the bisection algorithm [17][18].     Figure 6 shows the reduction rates of the a posteriori error estimate by the h-adaptive method for p = 1, the PERS of the hp-adaptive method, and the Maximum Strategy of the hp-adaptive method used in [16]. We find that the hp-adaptive methods show great superiority over the h-adaptive method in reducing the error. It also shows that both of the algorithms yield exponential decay of the a posteriori error estimate:

Simulation for a Single-Phase Power Transformer
The last numerical experiment is carried out for a single-phase power transformer. The magnetic conductors are the wall of the oil tank, the iron core, and two iron yokes. Moreover, the model also contains nonmagnetic conductors and coils which carry exciting source currents (see Figure 9).
The computations are performed on the cluster LSEC-III of the State Key Laboratory on Scientific and Engineering Computing, Chinese Academy of Sciences.
We partition the computational domain into a tetrahedral mesh with 1,381,600 elements. The polynomial degree of the finite element space is set by 2 = k . The total number of degrees of freedom is 13,147,086. We use 128 CPU cores in the computation. At each time step, the relative residual of the Newton method is reduced to be less than 0.0001. Table 3 shows the measured iron loss versus the calculated iron loss in the conductors. The relative error is about 10%. This demonstrates that FEMAG has the capability to simulate complicated electromagnetic devices. Table 3. Iron loss in the lamination and the magnetic palte (KW).

Calculated iron loss
Measured iron loss 4. 10 4.55 Figure 9. The geometric illustration of a single-phase transformer.

Concluding Remarks
In this paper, we report the parallel finite element toolbox, FEMAG (Fem for ElectroMagnetics on Adaptive Grids), for simulating large electromagnetic devices. Three numerical experiments are proposed to demonstrate the competitive performance of FEMAG, in terms of the scalability of parallel computation, the adaptivity for both mesh refinement and distribution adjustment of polynomial degrees, and the capability to deal with complex electromagnetic devices. All numerical results are verified by experimental data. The FEMAG is efficient for large-scale simulation of electromagnetic problems and has the potential in practical applications.