Parallel Thomas Approach in Computational Fluid Dynamics with GPUs: Lid-driven Cavity

Document Type : Short Paper

Authors

1 Tehran University

2 Shahrood University

Abstract

In this paper, three algorithms (Cyclic Reduction, Parallel Cyclic Reduction, and Parallel Thomas) are introduced for solving tridiagonal systems of equations on GPUs, and the effect of coalesced versus uncoalesced access to global memory is studied. To assess these algorithms, the lid-driven cavity flow is simulated as a case study, and the runtimes and physical parameters are compared with those of the classical Thomas algorithm executed on a CPU. The maximum speed-ups of these algorithms over the CPU runtime are about 4.4x, 5.2x, and 38.5x, respectively. In addition, coalesced memory access yields a speed-up of approximately 2x on the GPU.
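For reference, the classical Thomas algorithm used as the CPU baseline can be sketched as below. This is a minimal Python illustration and not the paper's implementation. The names `thomas_solve` and `solve_batch` are hypothetical, and the sketch assumes the matrix needs no pivoting (for example, it is diagonally dominant). The batch loop only stands in for the idea, assumed here, that a parallel Thomas variant gives each independent tridiagonal system to its own GPU thread.

```python
def thomas_solve(a, b, c, d):
    """Solve one tridiagonal system with the classical Thomas algorithm.

    a: sub-diagonal (a[0] is unused)
    b: main diagonal
    c: super-diagonal (c[-1] is unused)
    d: right-hand side
    Assumption: no pivoting is required (e.g. diagonally dominant matrix).
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the sub-diagonal row by row.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Back substitution: recover the unknowns from last to first.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def solve_batch(systems):
    """Solve many independent systems one after another.

    Hypothetical stand-in for a parallel Thomas solver: on a GPU, each
    iteration of this loop would be assumed to run in its own thread.
    """
    return [thomas_solve(a, b, c, d) for (a, b, c, d) in systems]
```

The serial sweep costs O(n) operations per system, but each step depends on the previous one, so any parallelism in this approach has to come from solving many systems at once rather than from within a single solve.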

Keywords

