Submitted mz10-009.in on 8 ppn. It failed because too little memory was allocated (only ~2000 MB vs. the ~10,000 MB needed). To rectify this, added -l pvmem=5GB to the "sanity check" submission file and cut the ppn from 8 to 4. Also set mkmem 0 in the input file (for more on memory constraints, see https://www.nersc.gov/users/computational-systems/carver/running-jobs/memory-considerations/).
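For reference, the adjusted submission-file header looks roughly like this (a sketch, not the full script; the queue name and walltime here are placeholders):

```shell
#PBS -q reg_small            # placeholder queue for the sanity-check run
#PBS -l nodes=1:ppn=4        # cut from ppn=8 so each process gets more memory
#PBS -l pvmem=5GB            # per-process virtual memory limit
#PBS -l walltime=01:00:00    # placeholder walltime
```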
Adding mkmem 0 alone reduced the memory needs roughly 10-fold. I'm going to try running the job without pvmem=5GB and with 8 ppn.
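The relevant input-file change is one line; mkmem 0 tells abinit to keep wavefunctions on disk instead of in core memory, trading extra I/O for a much smaller footprint:

```
mkmem 0    # store wavefunctions on disk rather than in memory
```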
Submitted the job, but it appears that I have hit my disk quota. That means I'll have to clear out some space in my folders at some point in the next 12-24 hours.
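A quick way to see what is eating the quota before clearing space (a generic sketch; `largest_items` is just a helper name used here, not a NERSC tool):

```shell
# largest_items DIR: print the five largest entries under DIR (size in KB
# first), so the biggest offenders can be deleted or moved to $SCRATCH.
largest_items() {
  du -sk "$1"/* 2>/dev/null | sort -rn | head -5
}
```

Running `largest_items $HOME` lists the top candidates to move or delete.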
I had both this job and mz464 running, but both of them together filled my home quota, so both jobs stalled out. I moved the critical files over to $SCRATCH and resubmitted this job (mz10-009) with the following specs:
Submitted mz10-009.in to reg_med queue
Run on 32 nodes, 8 ppn
walltime of 36:00:00
ecut 25 Hartrees
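The resubmission script's resource request, roughly (ecut 25 is set in the .in file, not the submission script):

```shell
#PBS -q reg_med
#PBS -l nodes=32:ppn=8
#PBS -l walltime=36:00:00
```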
I also thought about submitting and then holding a larger job on 64 nodes, but the queue kept rejecting this input file. I'll ask Sam and Josh about why that might be happening, but for now the current job should (hopefully) suffice.
Met with Josh today and found a major error in my methodology. It turns out that in the submission file, the number of processors given to abinit is specified by the command mpirun -np ## (where ## is the number of processors). I had not been changing this, so for all jobs up to this point I was queued for large numbers of nodes but only using one. With this error rectified, the job was correctly resubmitted under the following parameters:
Run on 33 nodes, 8 ppn
walltime of 24:00:00
ecut 35 Hartrees
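The fix, concretely: -np has to be updated to match the total core count every time nodes or ppn changes (the abinit files-file and log names below are placeholders):

```shell
# Total MPI ranks must equal nodes * ppn from the #PBS -l request;
# with -np left at an old value, only that many cores do any work.
nodes=33
ppn=8
np=$((nodes * ppn))
echo "mpirun -np $np abinit < mz10-009.files > mz10-009.log"
```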
mz10-009 came out of the queue, and it looks like everything went well. Next week, I'll give the iterative Hirshfeld calculations a shot.
Converged after 18 SCF cycles
nband 307
ngfft 100 120 160
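As a sanity check on the earlier ~10,000 MB figure, a back-of-envelope estimate from these run parameters (my own rough accounting, not abinit's exact memory model: one complex double per band per FFT grid point):

```shell
nband=307
n1=100; n2=120; n3=160                  # ngfft dimensions
bytes=$((nband * n1 * n2 * n3 * 16))    # 16 bytes per complex double
mb=$((bytes / 1000000))
echo "~${mb} MB for the wavefunctions"  # prints ~9431 MB
```

That lands in the same ballpark as the ~10,000 MB the job originally needed, which is reassuring.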