Welcome to Prism’s Documentation!¶
Prism is a technology for building platform-agnostic workload analysis tools. Tools are built once and are able to run across multiple architectures and environments. Prism targets complex analyses that are latency-tolerant, in contrast to real-time analyses.
Prism aims to improve three main components of designing new analysis tools for research: 1) modularity, 2) design flexibility, and 3) productivity.
Overview¶
Prism is a framework designed to help analyze dynamic behavior in applications. This dynamic behavior, or workload, is a result of the application and its given inputs and state.
Workloads¶
One of the main goals behind Prism is providing a straightforward interface to intuitively represent and analyze workloads. A workload can be represented in many ways. Each way has different requirements.
For example, you can represent a workload as a simple assembly instruction trace...:
push %rbp
push %rbx
mov %rsi,%rbp
mov %edi,%ebx
sub $0x8,%rsp
callq 4377b0 <_Z17myfuncv>
callq 4261e0 <_ZN5myotherfunc>
mov %rbp,%rdx
mov %ebx,%esi
mov %rax,%rdi
callq 422460 <_ZN5GO>
add $0x8,%rsp
xor %eax,%eax
pop %rbx
pop %rbp
retq
...or a call graph...:
...or a memory trace...:
ADDR BYTES
0xdeadbeef 8
0x12345678 4
0x00000000 1
...
...or more complex representations. Fundamentally, all workload representations can be broken down into five event primitives.
Event Primitives¶
Because of the variety of use-cases being supported, Prism presents workloads as a set of extensible primitives.
Event Primitive | Description |
---|---|
Compute | some transformation of data |
Memory | some movement of data |
Control Flow | divergence in an event stream |
Synchronization | ordering between separate event streams |
Context | grouping of events |
E.g., an abstract workload is represented as:
...
compute FLOP, add, SIMD4
memory write, 4B, <addr1>
memory read, 16B, <addr2>
context func, enter, hello_world_thread
sync create, <TID1>
...
Event Generation¶
Many tools exist to capture workloads. Currently, Valgrind is well supported. DynamoRIO is on its way to good support, and we are experimenting with traces captured with hardware features.
Eventually, we aim to support a broad spectrum of tools covering many applications and hardware architectures, e.g.:
- static instrumentation tools
- dynamic binary instrumentation tools
- hardware performance counter sampling
- architecture-specific
- simulation probes
- and others
Each framework has its merits depending on the desired granularity and source of the event trace. Most binary instrumentation frameworks do a good job of observing the instruction stream of general-purpose CPU workloads, but incur large overheads and may perturb results. Hardware support is good for real-time capture, but may have trouble capturing a native-sized workload. Execution-driven simulators are great for fine-grained, low-level traces, but simulation time may be intractable for very large workloads, and simulators must support the application. Additional capture methodologies exist for applications designed in interpreted or managed languages.
Prism recognizes these trade-offs and creates an abstraction over the underlying framework that observes the workload. Events are translated into Prism event primitives, which are then presented to the user for further analysis or simple trace generation. The component used in a given framework for event generation is a Prism frontend, and the user-defined analysis or trace generation on those events is a Prism backend. Currently, backends are written as C++ static plugins to Prism. We are interested in expanding support to C++ dynamic libraries and Python bindings.
Getting Started¶
Congrats on getting this far! 🎉
This portion of the documentation will walk you through setting up Prism and creating your first tool. Onwards! 🚀
Quickstart¶
This page will quickly walk you through building and running Prism.
Building Prism¶
Note
The default compiler for CentOS 7 and older (gcc <5) does not support C++14. Install and enable the official Devtoolset before compiling.
Clone and build Prism from source:
$ git clone https://github.com/vandal/prism
$ cd prism
$ mkdir build && cd build
$ cmake3 .. # CentOS 7 requires cmake3 package
$ make -j
This creates a build/bin folder containing the prism executable.
It can be run in place, or the entire bin folder can be moved, although it’s not advised to move it to a system location.
Running Prism¶
Prism requires at least two arguments: the backend analysis tool, and the executable application to measure:
$ bin/prism --backend=stgen --executable=./mybinary
The backend is the analysis tool that will analyze the requested events in mybinary. In this example, stgen is the backend that processes events into a special event trace that is used in SynchroTrace.
A third option, frontend, will change the underlying method for observing the application. By default, this is Valgrind:
$ bin/prism --frontend=valgrind --backend=stgen --executable=./mybinary
Dependencies¶
PACKAGE | VERSION |
---|---|
gcc/g++ | 5+ |
cmake | 3.1.3+ |
make | 3.8+ |
automake | 1.13+ |
autoconf | 2.69+ |
zlib/zlib-dev | 1.2.7+ |
git | 1.8+ |
Building your first Prism tool¶
This example will demonstrate how to get started analyzing a workload. We’ll generate a simple tool that counts the number of memory events in a workload.
Writing Your Tool¶
First, let’s make a new folder for our backend, called EventCounter, and begin making the backend.
$ cd prism
$ mkdir src/Backends/EventCounter
$ touch src/Backends/EventCounter/EventCounter.hpp
Currently, all backends are created in C++, and inherit from a BackendIface class.
// EventCounter.hpp
#include "Core/Backends.hpp"
class EventHandler : public BackendIface { };
By default, each event is ignored.
Let’s override this behavior and keep count of how many memory events pass.
// EventCounter.hpp
#include "Core/Backends.hpp"
class EventHandler : public BackendIface
{
public:
    // count every memory event delivered to this handler
    virtual void onMemEv(const sigil2::MemEvent &ev) override {
        memory_total++;
    }
private:
    unsigned memory_total{0};
};
We keep track of the total memory count in a private class variable, memory_total.
If multiple event streams are enabled, a new class instance is created for each stream.
This means we won’t be totalling events from the entire workload! A naive approach is to use an atomic variable that all EventHandler instances can access.
// EventCounter.hpp
#include "Core/Backends.hpp"
#include <atomic>
extern std::atomic<unsigned> global_memory_total;
class EventHandler : public BackendIface
{
public:
    // every instance updates the same shared atomic counter
    virtual void onMemEv(const sigil2::MemEvent &ev) override {
        global_memory_total++;
    }
};
Now let’s optimize our EventHandler to only update our atomic global once at the end when the destructor is called, instead of at every memory event.
We’ll also include two extra functions:
- an event requirements function, to let Prism know to generate memory events
- a cleanup function, that executes after all event generation and event analysis has been performed.
$ touch src/Backends/EventCounter/EventCounter.cpp
// EventCounter.hpp
#ifndef EVENTCOUNTER_H
#define EVENTCOUNTER_H
#include "Core/Backends.hpp"
#include <atomic>
// forward function declarations
void cleanup(void);
sigil2::capabilities requirements(void);
// global memory event counter
extern std::atomic<unsigned> global_memory_total;
class EventHandler : public BackendIface
{
public:
    // flush this stream's count into the global total exactly once
    ~EventHandler() {
        global_memory_total += memory_total;
    }
    virtual void onMemEv(const sigil2::MemEvent &ev) override {
        memory_total++;
    }
private:
    unsigned memory_total{0};
};
#endif
// EventCounter.cpp
#include "EventCounter.hpp"
#include <iostream>
std::atomic<unsigned> global_memory_total{0};
// Event Request
sigil2::capabilities requirements()
{
using namespace sigil2;
using namespace sigil2::capability;
auto caps = initCaps();
caps[MEMORY] = availability::enabled;
return caps;
}
// Final Clean up call
void cleanup()
{
std::cout << "Total Memory Events: " << global_memory_total << std::endl;
}
Registering Your Tool¶
Let’s set up our new tool in Prism. Prism uses static plugins at the moment. This requires altering a bit of Prism source code, but is easier to maintain as a small project.
$ cd src/Core
$ $EDITOR main.cpp
// main.cpp
int main(int argc, char* argv[])
{
auto config = Config()
.registerFrontend(/* ... */)
// register more frontends
.registerBackend(/* ... */)
// register more backends
.parseCommandLine(argc, argv);
return startPrism(config);
}
We can see all enabled backends and frontends here in one spot. This is clear and efficient when working with a smaller number of tools. Let’s register our backend.
// main.cpp
int main(int argc, char* argv[])
{
auto config = Config()
.registerFrontend(/* ... */)
// register more frontends
.registerBackend(/* ... */)
// register more backends
.registerBackend("EventCounter",
{[]{return std::make_unique<::EventHandler>();},
{},
::cleanup,
::requirements})
.parseCommandLine(argc, argv);
return startPrism(config);
}
The registerBackend member function takes the tool name followed by a braced group of four entries, for 5 items in total:
- The name of the tool—this is used in the command line option.
- A function that returns a new instance of our event handler—we’ll use an anonymous function.
- A function to take any extra command line options—we aren’t using this so it’ll stay blank.
- An end function that is called after all events have been passed to the tool.
- A function that returns a set of events required by the Prism tool.
Now let’s make sure the build system knows about our tool. We need to add our tool as a static library to Prism.
$ cd src/Backends/EventCounter
$ cat > CMakeLists.txt <<'EOF'
> set(TOOLNAME EventCounter)
> set(SOURCES EventCounter.cpp)
>
> add_library(${TOOLNAME} STATIC ${SOURCES})
> set(PRISM_TOOLS_LIBS ${TOOLNAME} PARENT_SCOPE)
> EOF
And now we recompile Prism:
$ cd build
$ cmake ..
$ make -j
Running Your Tool¶
The new tool can be invoked as:
$ cd build
$ bin/prism --backend=EventCounter --executable=ls
The Profiling Frontend¶
A frontend is the component that generates the event stream. By default, this is Valgrind (mostly due to historical reasons).
While it’s tempting to assume that the event generation just works™, you should be aware of the intrinsic nature of the chosen frontend before making any large assumptions.
Valgrind¶
Valgrind is the default frontend. No additional options are required. The following two command lines are equivalent.
$ bin/prism --backend=simplecount --executable=ls -lah
$ bin/prism --frontend=valgrind --backend=simplecount --executable=ls -lah
Valgrind is a copy & annotate dynamic binary instrumentation tool. This means that the dynamic instruction stream is grouped into blocks, disassembled into Valgrind’s VEX IR, instrumented, and then recompiled just-in-time.
DynamoRIO¶
DynamoRIO is not built with Prism by default. To enable DynamoRIO as a frontend, build Prism using the following cmake build command:
$ cmake .. -DCMAKE_BUILD_TYPE=release -DENABLE_DRSIGIL:bool=true
DynamoRIO can now be invoked as a frontend:
$ bin/prism --frontend=dynamorio --backend=simplecount --executable=ls -lah
DynamoRIO’s IR exists closer to the ISA than the IR used by Valgrind. Prism converts DynamoRIO IR to event primitives by inspection of each opcode.
Todo
mmm475 to fill in more details
Events Documentation¶
Events List¶
Five event primitives:
- memory
- compute
- synchronization
- context
- control flow
Memory¶
Attribute | Details |
---|---|
Type | none, read, write |
Address | numeric |
Size (Bytes) | numeric |
Compute¶
Attribute | Details |
---|---|
Type | Integer Operation (IOP), Floating Point Operation (FLOP) |
Arity | numeric |
Size | numeric |
Cost Operation | add, sub, mult, div, shift, mov |
Synchronization¶
Attribute | Details |
---|---|
Type | none, spawn, join, barrier, sync, swap, lock, unlock, conditional wait, conditional signal, conditional broadcast, spin lock, spin unlock |
data1 | numeric |
data2 | numeric |
Todo
data1/2 is currently a hack for SynchroTraceGen. Eventually we want to have the amount of data change depending on Type. Each datum is not necessarily used, depending on the Type. Ideally the amount of data tupled in the event will depend on its Type, but it’s faster to iterate over when there’s a definitive size.
Context¶
Attribute | Details |
---|---|
Type | none, instruction, basic Block, function Enter, function Exit, thread |
id | numeric |
name (function) | string |
Todo
Currently threads are delimited in the event stream with a Sync-Swap event. This should eventually move to a Cxt-Thread event, since the event does not strictly order the threads, and is intended to just group events that follow it.
Control Flow¶
Note
Control Flow is currently not implemented. This table is intended as a guide for future support.
Attribute | Details | |
---|---|---|
Type | jump, call, return, suspend | |
Conditional | true, false | condition |
Destination Type | instruction, other | |
Destination | numeric | |
Notes¶
Backend Documentation¶
SimpleCount¶
Synopsis¶
$ bin/prism --frontend=FRONTEND --backend=simplecount --executable=mybinary -myoptions
Description¶
SimpleCount is a demonstrative backend that counts each event type received from a given frontend. These events are aggregated across all threads.
SynchroTraceGen¶
Synopsis¶
$ bin/prism --frontend=FRONTEND --backend=stgen OPTIONS --executable=mybinary -myoptions
Description¶
SynchroTraceGen is a backend for generating trace files for the SynchroTrace simulation framework.
Each thread detected by SynchroTraceGen is given its own output trace file, named sigil.events-#.out.
By default, the output is directly compressed since the trace files can grow very large.
Options¶
Frontend Documentation¶
Each frontend generates one or more event streams to a Prism backend analysis tool. Each frontend has its own internal representation (IR) of events, so the process of converting frontend IR to Prism event primitives is different for each frontend. For example, Valgrind will disassemble each machine instruction into multiple VEX IR statements and expressions; DynamoRIO annotates each instruction in a basic block with specific attributes; the current Perf frontend only supports x86_64 decoding via the Intel XED library.
Valgrind¶
Synopsis¶
$ bin/prism --frontend=valgrind OPTIONS --backend=BACKEND --executable=mybinary -myoptions
Description¶
Uses a heavily modified Callgrind tool, Sigrind, to observe Prism event primitives and pass them to the backend. Valgrind serializes all threads in the target executable, so only one thread’s event stream is passed to the backend at a time. A context switch is signaled with a Prism context event. Because threads are serialized by Valgrind, the target executable is mostly deterministic.
Options¶
Multithreaded Application Support¶
The Valgrind frontend automatically supports synchronization events in applications that use the POSIX threads library and/or the OpenMP library by intercepting relevant API calls.
Pthreads¶
Pthreads should be supported for most versions of GCC/libc, because the Pthread API is quite stable.
Pthreads support exists for any application dynamically linked to the Pthreads library.
See Static Library Support for applications that are statically linked.
OpenMP¶
Only GCC 4.9.2 is officially supported for synchronization event capture, because the implementation of the library is more likely to change between GCC versions.
Dynamically linked OpenMP applications are not supported. Only Static Library Support exists.
Static Library Support¶
Applications that use a static Pthreads or OpenMP library must be manually linked with the sigil2-valgrind wrapper archive.
This can be found in BUILD_DIR/bin/libsglwrapper.a.
For example:
$CC $CFLAGS main.c -Wl,--whole-archive $BUILD_DIR/bin/libsglwrapper.a -Wl,--no-whole-archive
DynamoRIO¶
Synopsis¶
$ bin/prism --num-threads=N --frontend=dynamorio OPTIONS --backend=BACKEND --executable=mybinary -myoptions
Description¶
Note
-DDYNAMORIO_ENABLE=ON must be passed to cmake during configuration to build with DynamoRIO support.
DynamoRIO is a cross-platform dynamic binary instrumentation tool. DynamoRIO runs multithreaded applications natively. This makes results less reproducible than with Valgrind, but analysis is potentially faster on a multi-core architecture, because multiple event streams can be processed at once by setting --num-threads > 1.
Intel Process Trace¶
Synopsis¶
$ bin/prism --frontend=perf --backend=BACKEND --executable=perf.data
Description¶
Note
-DPERF_ENABLE=ON must be passed to cmake during configuration to build with Perf PT support.
Intel Process Trace is a new CPU feature available on Intel processors that are Broadwell or more recent. The trace is captured via branch results. The entire trace is then reconstructed by perf by replaying the binary, including all shared library loading and context switches. A side effect of only capturing branch results is that all runtime information within the trace is lost, such as some memory access addresses; e.g. the Perf ‘replay’ mechanism does not support replaying malloc results.
For more usage details, see: perf design document for Intel PT
For more technical details see: Intel Software Developer’s Manual Volume Three
Options¶
Note
The perf.data file is generated with: perf record -e intel_pt//u ./myexec
If you receive ‘AUX data lost N times out of M!’, try increasing the size of the AUX buffer. Otherwise a significant portion of the trace may not be reproduced:
perf record -m,AUXTRACE_PAGES -e intel_pt//u ./myexec
Todo
options
About¶
Prism comes from Drexel University’s VLSI & Architecture Lab (VANDAL), headed by Dr. Baris Taskin and in collaboration with Tufts University’s Dr. Mark Hempstead.
The goal of Prism is modular application analysis. It was formed from the need to support multiple projects that study application traces, aimed at data-driven architecture design. This has included early hardware accelerator co-design [SIGIL], as well as uncore design space exploration with multi-threaded workloads [SYNCHROTRACE] [UNCORERPD]. Prism is not interested in changing the functional behavior of an application, but instead aims to classify events in the application and present those events for further analysis. In this way, Prism does not require that each researcher have an in-depth understanding of the binary instrumentation tools.
Changelog¶
Versioning is generally based on semantic versioning.
1.0.0 (2018-4-9)¶
User Notes¶
Initial release for our ISPASS’18 publication.
Events¶
Control Flow events are not currently implemented, although a provisional interface is provided in Events Documentation.
Frontends¶
Valgrind is fairly well supported. It can be slow at event generation, although a faster version is in the works.
DynamoRIO is less well supported, but should generate basic memory, compute, and synchronization events fairly well. This is planned to be updated.
The Intel PT perf frontend was implemented as a proof-of-concept and there is a lot of room for increased event support and optimization.
Developer Notes¶
A new Valgrind implementation (gengrind) is close to being completed. Currently we are working on how best to implement branching with VEX IR, which seems to only support limited branching. The function tracking component of gengrind is based off of Callgrind. A few bugs may be present in this new function tracking since we stripped away some of the Callgrind specific functionality, such as cost-centers and cache simulations.
The DynamoRIO frontend requires some extra event checks to make sure the raw instructions it sees are properly binned. Additionally, some restrictions in its internal lock implementations make detecting and generating synchronization events more costly than we would expect. Specifically, we cannot directly generate events in function intercepts, and must instead set a flag that gets checked in every basic block. Also, we plan to look into thread-private code-caches to optimize ROI checks that happen at the beginning of each basic block.
Features¶
- Flexible application analysis
- Use multiple frontends for capturing software workloads like Valgrind and DynamoRIO
- Use custom C++14 libraries for analyzing event streams
- Platform-independent events
- Straightforward and extensible format, simplifying analysis
Installation¶
See the Quickstart for installation instructions.
Contribute¶
Source Code: https://github.com/vandal/prism
Issue Tracker: https://github.com/vandal/prism/issues
License¶
This project is licensed under the BSD3 license.