HPCG-CUDA-Benchmark version=1.0.0
HPCG-Benchmark version=3.1
Machine Summary=
Machine Summary::Distributed Processes=608
Machine Summary::Threads per processes=6
Global Problem Dimensions=
Global Problem Dimensions::Global nx=1536
Global Problem Dimensions::Global ny=3072
Global Problem Dimensions::Global nz=7296
Processor Dimensions=
Processor Dimensions::npx=4
Processor Dimensions::npy=8
Processor Dimensions::npz=19
Local Domain Dimensions=
Local Domain Dimensions::nx=384
Local Domain Dimensions::ny=384
Local Domain Dimensions::Lower ipz=0
Local Domain Dimensions::Upper ipz=18
Local Domain Dimensions::nz=384
########## Problem Summary ##########=
Setup Information=
Setup Information::Setup Time=0.443586
Linear System Information=
Linear System Information::Number of Equations=34426847232
Linear System Information::Number of Nonzero Terms=928834924024
Multigrid Information=
Multigrid Information::Number of coarse grid levels=3
Multigrid Information::Coarse Grids=
Multigrid Information::Coarse Grids::Grid Level=1
Multigrid Information::Coarse Grids::Number of Equations=4303355904
Multigrid Information::Coarse Grids::Number of Nonzero Terms=116018157304
Multigrid Information::Coarse Grids::Number of Presmoother Steps=1
Multigrid Information::Coarse Grids::Number of Postsmoother Steps=1
Multigrid Information::Coarse Grids::Grid Level=2
Multigrid Information::Coarse Grids::Number of Equations=537919488
Multigrid Information::Coarse Grids::Number of Nonzero Terms=14480731000
Multigrid Information::Coarse Grids::Number of Presmoother Steps=1
Multigrid Information::Coarse Grids::Number of Postsmoother Steps=1
Multigrid Information::Coarse Grids::Grid Level=3
Multigrid Information::Coarse Grids::Number of Equations=67239936
Multigrid Information::Coarse Grids::Number of Nonzero Terms=1804713400
Multigrid Information::Coarse Grids::Number of Presmoother Steps=1
Multigrid Information::Coarse Grids::Number of Postsmoother Steps=1
########## Memory Use Summary ##########=
Memory Use Information=
Memory Use Information::Total memory used for data (Gbytes)=24610.2
Memory Use Information::Memory used for OptimizeProblem data (Gbytes)=0
Memory Use Information::Bytes per equation (Total memory / Number of Equations)=714.856
Memory Use Information::Memory used for linear system and CG (Gbytes)=21658.8
Memory Use Information::Coarse Grids=
Memory Use Information::Coarse Grids::Grid Level=1
Memory Use Information::Coarse Grids::Memory used=2587.4
Memory Use Information::Coarse Grids::Grid Level=2
Memory Use Information::Coarse Grids::Memory used=323.562
Memory Use Information::Coarse Grids::Grid Level=3
Memory Use Information::Coarse Grids::Memory used=40.4799
########## V&V Testing Summary ##########=
Spectral Convergence Tests=
Spectral Convergence Tests::Result=PASSED
Spectral Convergence Tests::Unpreconditioned=
Spectral Convergence Tests::Unpreconditioned::Maximum iteration count=11
Spectral Convergence Tests::Unpreconditioned::Expected iteration count=12
Spectral Convergence Tests::Preconditioned=
Spectral Convergence Tests::Preconditioned::Maximum iteration count=2
Spectral Convergence Tests::Preconditioned::Expected iteration count=2
Departure from Symmetry |x'Ay-y'Ax|/(2*||x||*||A||*||y||)/epsilon=
Departure from Symmetry |x'Ay-y'Ax|/(2*||x||*||A||*||y||)/epsilon::Result=PASSED
Departure from Symmetry |x'Ay-y'Ax|/(2*||x||*||A||*||y||)/epsilon::Departure for SpMV=0
Departure from Symmetry |x'Ay-y'Ax|/(2*||x||*||A||*||y||)/epsilon::Departure for MG=0
########## Iterations Summary ##########=
Iteration Count Information=
Iteration Count Information::Result=PASSED
Iteration Count Information::Reference CG iterations per set=50
Iteration Count Information::Optimized CG iterations per set=56
Iteration Count Information::Total number of reference iterations=20800
Iteration Count Information::Total number of optimized iterations=23296
########## Reproducibility Summary ##########=
Reproducibility Information=
Reproducibility Information::Result=PASSED
Reproducibility Information::Scaled residual mean=0.00478523
Reproducibility Information::Scaled residual variance=0
########## Performance Summary (times in sec) ##########=
Benchmark Time Summary=
Benchmark Time Summary::Optimization phase=0.329409
Benchmark Time Summary::DDOT=102.006
Benchmark Time Summary::WAXPBY=71.5587
Benchmark Time Summary::SpMV=374.115
Benchmark Time Summary::MG=1417.44
Benchmark Time Summary::Total=1965.46
Floating Point Operations Summary=
Floating Point Operations Summary::Raw DDOT=4.84069e+15
Floating Point Operations Summary::Raw WAXPBY=4.84069e+15
Floating Point Operations Summary::Raw SpMV=4.40491e+16
Floating Point Operations Summary::Raw MG=2.46951e+17
Floating Point Operations Summary::Total=3.00681e+17
Floating Point Operations Summary::Total with convergence overhead=2.68465e+17
GB/s Summary=
GB/s Summary::Raw Read B/W=942244
GB/s Summary::Raw Write B/W=217799
GB/s Summary::Raw Total B/W=1.16004e+06
GB/s Summary::Total with convergence and optimization phase overhead=1.01908e+06
GFLOP/s Summary=
GFLOP/s Summary::Raw DDOT=47454.8
GFLOP/s Summary::Raw WAXPBY=67646.4
GFLOP/s Summary::Raw SpMV=117742
GFLOP/s Summary::Raw MG=174223
GFLOP/s Summary::Raw Total=152982
GFLOP/s Summary::Total with convergence overhead=136591
GFLOP/s Summary::Total with convergence and optimization phase overhead=134393
User Optimization Overheads=
User Optimization Overheads::Optimization phase time (sec)=0.329409
User Optimization Overheads::Optimization phase time vs reference SpMV+MG time=0.0470215
DDOT Timing Variations=
DDOT Timing Variations::Min DDOT MPI_Allreduce time=9.602
DDOT Timing Variations::Max DDOT MPI_Allreduce time=76.3247
DDOT Timing Variations::Avg DDOT MPI_Allreduce time=39.6966
Final Summary=
Final Summary::HPCG result is VALID with a GFLOP/s rating of=134393
Final Summary::HPCG 2.4 rating for historical reasons is=135646
Final Summary::Please upload results from the YAML file contents to=http://hpcg-benchmark.org