Tools for generating realistic messenger network traffic.

MGen

MGen is a client, server, and library for generating simulated messenger traffic. It is designed for use analogous to (and likely in conjunction with) TGen, but for simulating traffic generated from communications in messenger apps, such as Signal or WhatsApp, rather than web traffic or file downloads. Notably, this allows for studying network traffic properties of messenger apps in Shadow.

Like TGen, MGen can create message flows built around Markov models. Unlike TGen, these models are expressly designed with user activity in messenger clients in mind. These messages can be relayed through a central server, which can handle group messages (i.e., traffic that originates from one sender, but gets forwarded to multiple recipients). Alternatively, a peer-to-peer client can be used.

Clients also generate received receipts (small messages indicating to the sender of a message that the recipient's device has received it). These receipts can make up as much as half of all traffic. (Read receipts, however, are not supported.)

Usage

MGen is written entirely in Rust, and is built like most pure Rust projects. If you have a working Rust install with Cargo, you can build the client and server with cargo build. Normal cargo features apply---e.g., use the --release flag to enable a larger set of compiler optimizations. The server can be built and executed with cargo run --bin server, the client (for use with client-server mode) with cargo run --bin client [config.toml]..., and the peer (for use with peer-to-peer mode) with cargo run --bin peer [config.toml].... Alternatively, you can run the executables directly from the respective target directory (e.g., ./target/release/server).

Client Configuration

Clients are designed to simulate one conversation per configuration file. Part of this configuration is the user who sends messages in this conversation---as with techniques used in TGen, a single client instance can simulate the traffic of many individual users. The following example configuration, with explanatory comments, should cover almost everything you need:

# client-conversation.toml

# A name used for logs and to create unique circuits for each user on a client.
user = "Alice"

# A name used for logs and to create unique circuits for each conversation,
# even when two chats share the same participants.
group = "group1"

# The list of participants, except the sender.
recipients = ["Bob", "Carol", "Dave"]

# The <ip>:<port> of the socks5 proxy to connect through.
socks = "127.0.0.1:9050"

# The <address>:<port> of the message server, where <address> is an IP or onion address.
server = "insert.ip.or.onion:6397"

# The number of seconds to wait before the client starts sending messages.
# This should be long enough that all clients have had time to start (sending
# messages to a client that isn't yet registered on the server is a fatal
# error), but short enough that all conversations will have started by the
# start of the experiment.
bootstrap = 5.0

# The number of seconds to wait after a network failure before retrying.
retry = 5.0


# Parameters for distributions used by the Markov model.
[distributions]

# Probabilities of Idle to Active transition after sending/receiving messages.
s = 0.5
r = 0.1

# The distribution of message sizes, as measured in padding blocks.
m = {distribution = "Poisson", lambda = 1.0}

# Distribution I, the amount of time Idle before sending a message.
i = {distribution = "Normal", mean = 30.0, std_dev = 100.0}

# Distribution W, the amount of time Active without sending or receiving
# messages to transition to Idle.
w = {distribution = "Uniform", low = 0.0, high = 90.0}

# Distribution A_{s/r}, the amount of time Active since last sent/received
# message until the client sends a message.
a_s = {distribution = "Exp", lambda = 2.0}
a_r = {distribution = "Pareto", scale = 1.0, shape = 3.0}

The client currently supports five probability distributions for message timings: Normal, LogNormal, Uniform, Exp(onential), and Pareto. The parameter names can be found in the example above. Sampling a distribution returns a double-precision floating point number of seconds. The particular distributions and parameters used in the example are for demonstration purposes only; they have no relationship to empirical conversation behaviors. When sampling, values below zero are clamped to 0---e.g., the i distribution above will have an outsize probability of yielding 0.0 seconds, rather than redistributing that weight across positive values. Support for any other distribution in the rand_distr crate would be simple to add; distributions not in that crate can also be supported, but would require implementing them.

Peer configuration

Running in peer-to-peer mode is very similar to running a client. The only differences are that the user and each recipient consist of a name and an address, and there is no server. Here is an example peer conversation configuration (again, all values are for demonstration purposes only):

# peer-conversation.toml

user = {name = "Alice", address = "127.0.0.1:6397"}
group = "group1"
recipients = [{name = "Bob", address = "insert.ip.or.onion:6397"}]
socks = "127.0.0.1:9050"
bootstrap = 5.0
retry = 5.0

[distributions]
s = 0.5
r = 0.1
m = {distribution = "Poisson", lambda = 1.0}
i = {distribution = "Normal", mean = 30.0, std_dev = 100.0}
w = {distribution = "Normal", mean = 30.0, std_dev = 30.0}
a_s = {distribution = "Normal", mean = 10.0, std_dev = 5.0}
a_r = {distribution = "Normal", mean = 10.0, std_dev = 5.0}

In the likely case that these peers are connecting via onion addresses, you must configure each torrc file to match its peer configuration (in the above example, Alice's HiddenService lines in her torrc must include a HiddenServicePort line that forwards to 127.0.0.1:6397, and Bob's torrc must include a HiddenServicePort line that listens on port 6397). Multiple users can share an onion address by using different ports (different circuits will be used).
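As an illustration, Alice's side of the example above might use torrc lines like the following (the HiddenServiceDir path is a placeholder; Bob's torrc would need an analogous pair of lines for his own onion service):

```
# Publish an onion service whose virtual port 6397 forwards to the
# local peer process listening on 127.0.0.1:6397.
HiddenServiceDir /var/lib/tor/alice_mgen/
HiddenServicePort 6397 127.0.0.1:6397
```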