The Sniper User Manual
Trevor E. Carlson, Wim Heirman
November 13, 2013
run-sniper generates an mpiexec command line using mpiexec and the number of MPI ranks (-np). By default, one MPI rank is run per core. This can be over
advancing too quickly with respect to the others. When using barrier synchronization via the clock_skew_minimization/scheme=barrier configuration option,
HOOK_SYSCALL_ENTER
HOOK_SYSCALL_EXIT
HOOK_APPLICATION_START
HOOK_APPLICATION_EXIT
HOOK_APPLICATION_ROI_BEGIN
HO
Listing 17: Hierarchical Section Config File
[perf_model/core/interval_timer]
window_size = 96

5.1 Configuration Files
The method we most of
Listing 21: Example configuration file
[perf_model/core]
frequency = 2.66 # Set the default value
frequency[] = 1.0,,,1.0 # Core 1,2 uses t
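
The interplay between the default value and the per-core array in Listing 21 can be sketched in a few lines of Python. This is a standalone illustration, not Sniper's own configuration parser: the helper name per_core_values is hypothetical, the core count of 4 is chosen to match the four array entries, and the rule that an empty array entry falls back to the scalar default is an assumption based on the listing's comments.

```python
import configparser

# Same keys as Listing 21; inline comments are left out because
# configparser does not strip them by default.
CONFIG_TEXT = """
[perf_model/core]
frequency = 2.66
frequency[] = 1.0,,,1.0
"""

def per_core_values(section, key, num_cores):
    """Return one float per core (hypothetical helper).

    Assumption: an empty or missing entry in the `key[]` array means
    "use the scalar default stored under `key`".
    """
    default = float(section[key])
    entries = section.get(key + "[]", fallback="").split(",")
    values = []
    for core in range(num_cores):
        entry = entries[core].strip() if core < len(entries) else ""
        values.append(float(entry) if entry else default)
    return values

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)
frequencies = per_core_values(parser["perf_model/core"], "frequency", 4)
print(frequencies)  # [1.0, 2.66, 2.66, 1.0]
```

With zero-based core numbering, cores 0 and 3 take the explicit 1.0 entries while cores 1 and 2 inherit the 2.66 default.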
6.1.2 Caches
Description              Example Option
Number of cache levels   perf_model/cache/levels=3
L1-I options             perf_model/l1_icache/*
L1-D options             perf_model/l1_dc
# Setup the core DVFS transition latency
[dvfs]
transition_latency = 10000 # In ns
# Configure 4-core DVFS granularity
[dvfs/simple
"""
import sys, os, sim

class Dvfs:
  def setup(self, args):
    self.events = []
    args = args.split(':')
    for i in range(0, len(ar
7.3 Loop Tracer
The loop tracer allows one to determine the steady-state performance of an application loop. To use it, configure Sniper with the paramet
Figure 1: CPI stack over time for Splash-2 FFT with 2 threads in the detailed, normalized view. The application was run in Sniper with the gainestown co
Contents
1 Introduction
1.1 What is Sniper?
1.2 Features
Figure 2: CPI stack over time for Splash-2 FFT with 2 cores running with the gainestown configuration in Sniper.

an application is not making significant
Figure 3: Topology of the gainestown microarchitecture with a sparkline showing the misses per 1000 instructions (MPKI) of the first L1 data cache. The s
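
The MPKI metric named in the caption of Figure 3 is the miss count scaled to a window of 1000 instructions. A minimal sketch of the arithmetic follows; the function name mpki and the sample counts are illustrative only and are not taken from Sniper's tools or from any simulation result.

```python
def mpki(misses, instructions):
    """Misses per 1000 instructions (illustrative helper, not a Sniper API)."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return 1000.0 * misses / instructions

# Made-up example: 2500 L1-D misses over one million instructions.
example = mpki(misses=2500, instructions=1_000_000)
print(example)  # 2.5
```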
-c <sniper-config> — Configuration file(s), see Section 5.1
-c [objname:]<name[.cfg]>,<name2[.cfg]>,... — Setup a heterogeneous configu
--save-patch — Save a patch (to sim.patch) with the current Sniper code differences
--pin-stats — Enable basic pin statistics. Normally saves to pin.log
--m
-o <file> — Save gnuplot plotted data to <file>.png
--simplified — Create a CPI stack merging all items into the following categories: compute, c
-f <int> — Number of instructions to fast forward before starting to create the trace
-d <int> — Number of instructions to record in the tra
8.4.2 tools/viz/level2.py
Generate the visualization of the cycle stacks over time.
tools/viz/level2.py [-h|--help (help)] [-d <resultsdir (default:.
9 Comprehensive Option List
9.1 Base Sniper Options
Listing 27: Base options (base.cfg)
# Configuration file for the Sniper simulator
# This file is
pin_codecache_trace = false
[progress_trace]
enabled = false
interval = 5000
filename = ""
[clock_skew_minimization]
s
recv = 1
sync = 0
spawn = 0
tlb_miss = 0
mem_access = 0
delay = 0
[perf_model/branch_predictor]
type = one_bit
mispredict_penalty = 14 # A gue
9 Comprehensive Option List
9.1 Base Sniper Options
9.2 Options used to configure the Nehalem core
cache_block_size = 64
cache_size = 32 # in KB
associativity = 4
address_hash = mask
replacement_policy = lru
data_access_time =
[core/hook_periodic_ins]
ins_per_core = 10000 # After how many instructions should each core increment the global HPI counter
ins_
[perf_model/dram/queue_model]
enabled = true
type = history_list
[perf_model/nuca]
enabled = false
[perf_model/sync]
reschedul
[queue_model/history_list]
# Uses the analytical model (if enabled) to calculate delay if cannot be calculated using the history list
ma
[scheduler]
type = pinned
[scheduler/pinned]
quantum = 1000000 # Scheduler quantum (round-robin for active threads on each core), in na
[perf_model/core/interval_timer]
dispatch_width = 4
window_size = 128
num_outstanding_loadstores = 10
[perf_model/
tags_access_time = 1
perf_model_type = parallel
writethrough = 0
shared_cores = 1
[perf_model/l2_cache]
perfect = false
cache_si
# See http://en.wikipedia.org/wiki/Gainestown_(microprocessor)#Gainestown
# and http://ark.intel.com/products/37106
#i
9.4 Sniper Prefetcher Options
Listing 30: Prefetcher options (prefetcher.cfg)
[perf_model/l2_cache]
# prefetcher = simple
prefetcher = ghb
[perf
#define SIM_OPT_INSTRUMENT_FASTFORWARD 2
// SimAPI commands
SimRoiStart()
SimRoiEnd()
SimGetProcId()
SimGetThreadId()
Sim
• CPI Stack generation
• Parallel, multi-threaded simulator
• Multi-threaded application support
• Multi-program workload support with the SIFT trace for
2.2 Compiling Sniper
One can compile Sniper with the following command: cd sniper && make. If you have multiple processors, you can take advanta
In addition to viewing the sim.out file, we encourage the use of the sniper/tools/sniper_lib.py:get_results() function to parse and process results. Th
• Detailed — Fully detailed models of cores, caches and interconnection network
• Cache-only — No core model (simulated time does not advance!), only sim
3.5 Using Your Own Benchmarks
Sniper can run most applications out of the box. Both static and dynamic binaries are supported, and no special recompilat
$ ./record-trace -o fft -b 1000000 -- test/fft/fft -p1 -m20
This command will generate a number of SIFT files with names in the format of <na