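There are all sorts of odd ways to compile and install VASP. I went with the simplest setup: the Intel compilers (ivf) plus mpich2, on Ubuntu 14.04. I have to say that this system is exactly why the build cost me a week.
First, install ifort. A big reason for choosing ifort is that it has a free trial version with a one-year license. Apply for a license (lic) with your email address; Intel will email you the ID and the download link, and after downloading you install it as usual. (The download speed on the education network is decent, around 10 MB/s, which is generous of them.)
Next, install mpich2, which is available from the Ubuntu repositories: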
```bash
sudo apt-get install mpich2
```
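Before moving on, it may be worth confirming that the compilers and the MPI wrapper are actually on your PATH. The helper below is only a sketch: `check_tools` is a made-up name, and the tool names passed to it are simply the ones used in this post.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): report which of the given
# tools can be found on PATH. Returns 0 if all are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tool names as used in this post; adjust them to your installation.
check_tools ifort icc mpif90 || echo "install the missing tools (or set up their environment) before building"
```

If ifort shows up as missing right after installation, the Intel environment script has probably not been sourced in the current shell. With MPICH, `mpif90 -f90=ifort -show` should print the underlying compile command without running it, which is a quick way to see whether the wrapper really picks up ifort.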
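The VASP source is compiled in two steps. The first is vasp.5.lib: just copy makefile_ivf to makefile and replace gcc and ivf in it with icc and ifort. There is nothing difficult about this part.
For the parallel VASP executable, start by copying makefile_linux_ivf to makefile, then make the following changes:
Comment out lines 53-55. On line 70 (the one starting with CPP_), delete the "-C" option, otherwise the code will not compile; this was the crux of the problem that held me up for more than a week, and I found no explanation of it online. Comment out the existing BLAS lines and add the location of your own Intel compiler's MKL libraries. Leave the LAPACK line empty. From the MPI section onward, uncomment the relevant build options.
The final makefile template for the parallel VASP build is as follows: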
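The `-C` change can be applied with a one-line sed filter. This is only a sketch: the function name `strip_cpp_dash_c` is made up for illustration, and it assumes the option appears as ` -C ` (with a space on each side) on an uncommented line starting with `CPP_`. The sample line in the demo is my assumption of what the unmodified template line looks like. The post does not say why `-C` breaks the build; a likely cause (my guess, not the author's) is that with `-C` newer GNU cpp versions pass C comments through to the output, which is not valid Fortran.

```shell
#!/bin/sh
# Hypothetical helper (not part of VASP): print a makefile with the " -C "
# option removed from the active CPP_ line. Commented-out CPP_ lines are
# left untouched. Reads the file named in $1, or standard input if no
# file is given.
strip_cpp_dash_c() {
    sed '/^CPP_[[:space:]]*=/ s/ -C / /' "$@"
}

# Demo on a sample line (assumed form of the unmodified template line).
printf '%s\n' 'CPP_ = ./preprocess <$*.F | /usr/bin/cpp -P -C -traditional >$*$(SUFFIX)' \
    | strip_cpp_dash_c
# prints: CPP_ = ./preprocess <$*.F | /usr/bin/cpp -P -traditional >$*$(SUFFIX)
```

To change a real file, write the result to a new file first, for example `strip_cpp_dash_c makefile > makefile.new && mv makefile.new makefile`, so that a mistake does not clobber the original.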
```makefile
.SUFFIXES: .inc .f .f90 .F
#-----------------------------------------------------------------------
# Makefile for Intel Fortran compiler for Pentium/Athlon/Opteron
# bases systems
# we recommend this makefile for both Intel as well as AMD systems
# for AMD based systems appropriate BLAS and fftw libraries are
# however mandatory (whereas they are optional for Intel platforms)
#
# The makefile was tested only under Linux on Intel and AMD platforms
# the following compiler versions have been tested:
#  - ifc.7.1  works stable somewhat slow but reliably
#  - ifc.8.1  fails to compile the code properly
#  - ifc.9.1  recommended (both for 32 and 64 bit)
#  - ifc.10.1 partially recommended (both for 32 and 64 bit)
#             tested build 20080312 Package ID: l_fc_p_10.1.015
#             the gamma only mpi version can not be compiles
#             using ifc.10.1
#
# it might be required to change some of library pathes, since
# LINUX installation vary a lot
# Hence check ***ALL*** options in this makefile very carefully
#-----------------------------------------------------------------------
#
# BLAS must be installed on the machine
# there are several options:
# 1) very slow but works:
#   retrieve the lapackage from ftp.netlib.org
#   and compile the blas routines (BLAS/SRC directory)
#   please use g77 or f77 for the compilation. When I tried to
#   use pgf77 or pgf90 for BLAS, VASP hang up when calling
#   ZHEEV  (however this was with lapack 1.1 now I use lapack 2.0)
# 2) more desirable: get an optimized BLAS
#
# the two most reliable packages around are presently:
# 2a) Intels own optimised BLAS (PIII, P4, PD, PC2, Itanium)
#     http://developer.intel.com/software/products/mkl/
#   this is really excellent, if you use Intel CPU's
#
# 2b) probably fastest SSE2 (4 GFlops on P4, 2.53 GHz, 16 GFlops PD,
#     around 30 GFlops on Quad core)
#   Kazushige Goto's BLAS
#   http://www.cs.utexas.edu/users/kgoto/signup_first.html
#   http://www.tacc.utexas.edu/resources/software/
#
#-----------------------------------------------------------------------

# all CPP processed fortran files have the extension .f90
SUFFIX=.f90

#-----------------------------------------------------------------------
# fortran compiler and linker
#-----------------------------------------------------------------------
#FC=ifort
# fortran linker
#FCL=$(FC)


#-----------------------------------------------------------------------
# whereis CPP -- (I need CPP, can't use gcc with proper options)
# that's the location of gcc for SUSE 5.3
#
#  CPP_   =  /usr/lib/gcc-lib/i486-linux/2.7.2/cpp -P -C
#
# that's probably the right line for some Red Hat distribution:
#
#  CPP_   =  /usr/lib/gcc-lib/i386-redhat-linux/2.7.2.3/cpp -P -C
#
#  SUSE X.X, maybe some Red Hat distributions:

CPP_ = ./preprocess <$*.F | /usr/bin/cpp -P -traditional >$*$(SUFFIX)

#-----------------------------------------------------------------------
# possible options for CPP:
# NGXhalf             charge density reduced in X direction
# wNGXhalf            gamma point only reduced in X direction
# avoidalloc          avoid ALLOCATE if possible
# PGF90               work around some for some PGF90 / IFC bugs
# CACHE_SIZE          1000 for PII,PIII, 5000 for Athlon, 8000-12000 P4, PD
# RPROMU_DGEMV        use DGEMV instead of DGEMM in RPRO (depends on used BLAS)
# RACCMU_DGEMV        use DGEMV instead of DGEMM in RACC (depends on used BLAS)
# tbdyn               MD package of Tomas Bucko
#-----------------------------------------------------------------------

#CPP     = $(CPP_)  -DHOST=\"LinuxIFC\" \
#          -DCACHE_SIZE=12000 -DPGF90 -Davoidalloc -DNGXhalf \
#          -DRPROMU_DGEMV -DRACCMU_DGEMV

#-----------------------------------------------------------------------
# general fortran flags  (there must a trailing blank on this line)
# byterecl is strictly required for ifc, since otherwise
# the WAVECAR file becomes huge
#-----------------------------------------------------------------------

FFLAGS = -FR -lowercase -assume byterecl

#-----------------------------------------------------------------------
# optimization
# we have tested whether higher optimisation improves performance
# -axK  SSE1 optimization, but also generate code executable on all mach.
#       xK improves performance somewhat on XP, and a is required in order
#       to run the code on older Athlons as well
# -xW   SSE2 optimization
# -axW  SSE2 optimization, but also generate code executable on all mach.
# -tpp6 P3 optimization
# -tpp7 P4 optimization
#-----------------------------------------------------------------------

# ifc.9.1, ifc.10.1 recommended
OFLAG=-O2 -ip -ftz

OFLAG_HIGH = $(OFLAG)
OBJ_HIGH =
OBJ_NOOPT =
DEBUG = -FR -O0
INLINE = $(OFLAG)

#-----------------------------------------------------------------------
# the following lines specify the position of BLAS and LAPACK
# VASP works fastest with the libgoto library
# so that's what we recommend
#-----------------------------------------------------------------------

# mkl.10.0
# set -DRPROMU_DGEMV -DRACCMU_DGEMV in the CPP lines
#BLAS=-L/opt/intel/mkl100/lib/em64t -lmkl -lpthread

# even faster for VASP Kazushige Goto's BLAS
# http://www.cs.utexas.edu/users/kgoto/signup_first.html
# parallel goto version requires sometimes -libverbs
#BLAS= /opt/libs/libgoto/libgoto.so
BLAS=-L/opt/intel/composer_xe_2013_sp1.2.144/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_sequential -lpthread
# LAPACK, simplest use vasp.5.lib/lapack_double
LAPACK=

# use the mkl Intel lapack
#LAPACK= -lmkl_lapack

#-----------------------------------------------------------------------

#LIB  = -L../vasp.5.lib -ldmy \
#     ../vasp.5.lib/linpack_double.o $(LAPACK) \
#     $(BLAS)
#
# options for linking, nothing is required (usually)
LINK =

#-----------------------------------------------------------------------
# fft libraries:
# VASP.5.2 can use fftw.3.1.X (http://www.fftw.org)
# since this version is faster on P4 machines, we recommend to use it
#-----------------------------------------------------------------------

#FFT3D   = fft3dfurth.o fft3dlib.o

# alternatively: fftw.3.1.X is slighly faster and should be used if available
#FFT3D   = fftw3d.o fft3dlib.o   /opt/libs/fftw-3.1.2/lib/libfftw3.a


#=======================================================================
# MPI section, uncomment the following lines until
#    general  rules and compile lines
# presently we recommend OPENMPI, since it seems to offer better
# performance than lam or mpich
#
# !!! Please do not send me any queries on how to install MPI, I will
# certainly not answer them !!!!
#=======================================================================
#-----------------------------------------------------------------------
# fortran linker for mpi
#-----------------------------------------------------------------------

FC=mpif90 -f90=ifort
FCL=$(FC)

#-----------------------------------------------------------------------
# additional options for CPP in parallel version (see also above):
# NGZhalf               charge density reduced in Z direction
# wNGZhalf              gamma point only reduced in Z direction
# scaLAPACK             use scaLAPACK (usually slower on 100 Mbit Net)
# avoidalloc            avoid ALLOCATE if possible
# PGF90                 work around some for some PGF90 / IFC bugs
# CACHE_SIZE            1000 for PII,PIII, 5000 for Athlon, 8000-12000 P4, PD
# RPROMU_DGEMV          use DGEMV instead of DGEMM in RPRO (depends on used BLAS)
# RACCMU_DGEMV          use DGEMV instead of DGEMM in RACC (depends on used BLAS)
# tbdyn                 MD package of Tomas Bucko
#-----------------------------------------------------------------------

#-----------------------------------------------------------------------

CPP    = $(CPP_) -DMPI -DHOST=\"LinuxIFC\" -DIFC \
         -DCACHE_SIZE=4000 -DPGF90 -Davoidalloc \
         -DMPI_BLOCK=8000
##       -DRPROMU_DGEMV -DRACCMU_DGEMV -DNGZhalf

#-----------------------------------------------------------------------
# location of SCALAPACK
# if you do not use SCALAPACK simply leave that section commented out
#-----------------------------------------------------------------------

BLACS=$(HOME)/archives/SCALAPACK/BLACS/
SCA_=$(HOME)/archives/SCALAPACK/SCALAPACK

SCA= $(SCA_)/libscalapack.a \
 $(BLACS)/LIB/blacsF77init_MPI-LINUX-0.a $(BLACS)/LIB/blacs_MPI-LINUX-0.a $(BLACS)/LIB/blacsF77init_MPI-LINUX-0.a

SCA=

#-----------------------------------------------------------------------
# libraries for mpi
#-----------------------------------------------------------------------

LIB     = -L../vasp.5.lib -ldmy \
      ../vasp.5.lib/linpack_double.o $(LAPACK) \
      $(SCA) $(BLAS)

# FFT: fftmpi.o with fft3dlib of Juergen Furthmueller
FFT3D   = fftmpi.o fftmpi_map.o fft3dfurth.o fft3dlib.o

# alternatively: fftw.3.1.X is slighly faster and should be used if available
#FFT3D   = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o   /opt/libs/fftw-3.1.2/lib/libfftw3.a

#-----------------------------------------------------------------------
# general rules and compile lines
#-----------------------------------------------------------------------
BASIC=   symmetry.o symlib.o   lattlib.o  random.o


SOURCE=  base.o     mpi.o      smart_allocate.o      xml.o  \
         constant.o jacobi.o   main_mpi.o  scala.o   \
         asa.o      lattice.o  poscar.o   ini.o  mgrid.o  xclib.o  vdw_nl.o  xclib_grad.o \
         radial.o   pseudo.o   gridq.o     ebs.o  \
         mkpoints.o wave.o     wave_mpi.o  wave_high.o  \
         $(BASIC)   nonl.o     nonlr.o    nonl_high.o dfast.o    choleski2.o \
         mix.o      hamil.o    xcgrad.o   xcspin.o    potex1.o   potex2.o  \
         constrmag.o cl_shift.o relativistic.o LDApU.o \
         paw_base.o metagga.o  egrad.o    pawsym.o   pawfock.o  pawlhf.o   rhfatm.o  paw.o   \
         mkpoints_full.o       charge.o   Lebedev-Laikov.o  stockholder.o dipol.o    pot.o \
         dos.o      elf.o      tet.o      tetweight.o hamil_rot.o \
         steep.o    chain.o    dyna.o     sphpro.o    us.o  core_rel.o \
         aedens.o   wavpre.o   wavpre_noio.o broyden.o \
         dynbr.o    rmm-diis.o reader.o   writer.o   tutor.o xml_writer.o \
         brent.o    stufak.o   fileio.o   opergrid.o stepver.o  \
         chgloc.o   fast_aug.o fock.o     mkpoints_change.o sym_grad.o \
         mymath.o   internals.o dynconstr.o dimer_heyden.o dvvtrajectory.o vdwforcefield.o \
         hamil_high.o nmr.o    pead.o     mlwf.o     subrot.o   subrot_scf.o \
         force.o    pwlhf.o  gw_model.o optreal.o   davidson.o  david_inner.o \
         electron.o rot.o  electron_all.o shm.o    pardens.o  paircorrection.o \
         optics.o   constr_cell_relax.o   stm.o    finite_diff.o elpol.o    \
         hamil_lr.o rmm-diis_lr.o  subrot_cluster.o subrot_lr.o \
         lr_helper.o hamil_lrf.o   elinear_response.o ilinear_response.o \
         linear_optics.o linear_response.o   \
         setlocalpp.o  wannier.o electron_OEP.o electron_lhf.o twoelectron4o.o \
         ratpol.o screened_2e.o wave_cacher.o chi_base.o wpot.o local_field.o \
         ump2.o bse_te.o bse.o acfdt.o chi.o sydmat.o dmft.o \
         rmm-diis_mlr.o  linear_response_NMR.o

vasp: $(SOURCE) $(FFT3D) $(INC) main.o
	rm -f vasp
	$(FCL) -o vasp main.o  $(SOURCE)   $(FFT3D) $(LIB) $(LINK)
makeparam: $(SOURCE) $(FFT3D) makeparam.o main.F $(INC)
	$(FCL) -o makeparam  $(LINK) makeparam.o $(SOURCE) $(FFT3D) $(LIB)
zgemmtest: zgemmtest.o base.o random.o $(INC)
	$(FCL) -o zgemmtest $(LINK) zgemmtest.o random.o base.o $(LIB)
dgemmtest: dgemmtest.o base.o random.o $(INC)
	$(FCL) -o dgemmtest $(LINK) dgemmtest.o random.o base.o $(LIB)
ffttest: base.o smart_allocate.o mpi.o mgrid.o random.o ffttest.o $(FFT3D) $(INC)
	$(FCL) -o ffttest $(LINK) ffttest.o mpi.o mgrid.o random.o smart_allocate.o base.o $(FFT3D) $(LIB)
kpoints: $(SOURCE) $(FFT3D) makekpoints.o main.F $(INC)
	$(FCL) -o kpoints $(LINK) makekpoints.o $(SOURCE) $(FFT3D) $(LIB)

clean:
	-rm -f *.g *.f *.o *.L *.mod ; touch *.F

main.o: main$(SUFFIX)
	$(FC) $(FFLAGS) $(DEBUG) $(INCS) -c main$(SUFFIX)
xcgrad.o: xcgrad$(SUFFIX)
	$(FC) $(FFLAGS) $(INLINE) $(INCS) -c xcgrad$(SUFFIX)
xcspin.o: xcspin$(SUFFIX)
	$(FC) $(FFLAGS) $(INLINE) $(INCS) -c xcspin$(SUFFIX)

makeparam.o: makeparam$(SUFFIX)
	$(FC) $(FFLAGS) $(DEBUG) $(INCS) -c makeparam$(SUFFIX)

makeparam$(SUFFIX): makeparam.F main.F
#
# MIND: I do not have a full dependency list for the include
# and MODULES: here are only the minimal basic dependencies
# if one strucuture is changed then touch_dep must be called
# with the corresponding name of the structure
#
base.o: base.inc base.F
mgrid.o: mgrid.inc mgrid.F
constant.o: constant.inc constant.F
lattice.o: lattice.inc lattice.F
setex.o: setexm.inc setex.F
pseudo.o: pseudo.inc pseudo.F
poscar.o: poscar.inc poscar.F
mkpoints.o: mkpoints.inc mkpoints.F
wave.o: wave.F
nonl.o: nonl.inc nonl.F
nonlr.o: nonlr.inc nonlr.F

$(OBJ_HIGH):
	$(CPP)
	$(FC) $(FFLAGS) $(OFLAG_HIGH) $(INCS) -c $*$(SUFFIX)
$(OBJ_NOOPT):
	$(CPP)
	$(FC) $(FFLAGS) $(INCS) -c $*$(SUFFIX)

fft3dlib_f77.o: fft3dlib_f77.F
	$(CPP)
	$(F77) $(FFLAGS_F77) -c $*$(SUFFIX)

.F.o:
	$(CPP)
	$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)
.F$(SUFFIX):
	$(CPP)
$(SUFFIX).o:
	$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)

# special rules
#-----------------------------------------------------------------------
# these special rules are cummulative (that is once failed
#   in one compiler version, stays in the list forever)
# -tpp5|6|7 P, PII-PIII, PIV
# -xW use SIMD (does not pay of on PII, since fft3d uses double prec)
# all other options do no affect the code performance since -O1 is used

fft3dlib.o : fft3dlib.F
	$(CPP)
	$(FC) -FR -lowercase -O2 -c $*$(SUFFIX)

fft3dfurth.o : fft3dfurth.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

fftw3d.o : fftw3d.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

wave_high.o : wave_high.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

radial.o : radial.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

symlib.o : symlib.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

symmetry.o : symmetry.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

wave_mpi.o : wave_mpi.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

wave.o : wave.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

dynbr.o : dynbr.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

asa.o : asa.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

broyden.o : broyden.F
	$(CPP)
	$(FC) -FR -lowercase -O2 -c $*$(SUFFIX)

us.o : us.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

LDApU.o : LDApU.F
	$(CPP)
	$(FC) -FR -lowercase -O2 -c $*$(SUFFIX)
```
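With both makefiles in place, the build order matters: the parallel makefile links against `-L../vasp.5.lib -ldmy` and `../vasp.5.lib/linpack_double.o`, so vasp.5.lib has to be built first and has to sit next to the main source directory. The wrapper below is a minimal sketch of that order. `build_vasp` is a made-up name, and the main source directory is passed as an argument because the post does not name it.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of VASP): build the library first, then the
# parallel executable.
# Usage: build_vasp LIB_DIR SRC_DIR
build_vasp() {
    lib_dir=$1
    src_dir=$2
    # Refuse to start unless both directories already contain a makefile.
    for dir in "$lib_dir" "$src_dir"; do
        if [ ! -f "$dir/makefile" ]; then
            echo "error: no makefile in $dir" >&2
            return 1
        fi
    done
    # Each make runs in a subshell, so the caller's working directory is unchanged.
    (cd "$lib_dir" && make) || return 1
    (cd "$src_dir" && make) || return 1
    echo "done: $src_dir/vasp"
}
```

A call would look like `build_vasp vasp.5.lib vasp.5.2`, where the second directory name is only an assumed example. The resulting binary is then normally started through the MPI launcher, for instance `mpirun -np 4 ./vasp`, from a directory that holds the VASP input files; the process count here is arbitrary.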